ABSTRACT

We interpret and model the statistical weak lensing measurements around 130,000 groups and
clusters of galaxies in the Sloan Digital Sky Survey presented by Sheldon et al. (2007). We present
non-parametric inversions of the 2D shear profiles to the mean 3D cluster density and mass profiles in 
bins of both optical richness and cluster i-band luminosity. Since the mean cluster density profile is
proportional to the cluster-mass correlation function, the mean profile is spherically symmetric by the 
assumptions of large-scale homogeneity and isotropy. We correct the inferred 3D profiles for systematic 
effects, including non-linear shear and the fact that cluster halos are not all precisely centered on their 
brightest galaxies. We also model the measured cluster shear profile as a sum of contributions from the 
brightest central galaxy, the cluster dark matter halo, and neighboring halos. We infer the relations 
between mean cluster virial mass and optical richness and luminosity over two orders of magnitude in 
cluster mass; the virial mass at fixed richness or luminosity is determined with a precision of ~ 13% 
including both statistical and systematic errors. We also constrain the halo concentration parameter 
and halo bias as a function of cluster mass; both are in good agreement with predictions from N-body 
simulations of LCDM models. The methods employed here will be applicable to deeper, wide-area 
optical surveys that aim to constrain the nature of the dark energy, such as the Dark Energy Survey, 
the Large Synoptic Survey Telescope and space-based surveys. 

Subject headings: gravitational lensing - galaxies: clusters - large-scale structure -
cosmology: observations - galaxies: halos - dark matter

1. INTRODUCTION

Clusters of galaxies are among the most promising probes of cosmology and of the physics of
structure formation. Theoretical calculations (Gunn & Gott 1972; Press & Schechter 1974)
followed by numerical simulations with ever-increasing resolution (e.g., Navarro et al. 1997;
Evrard et al. 2002) have led to a robust, quantitative framework for the understanding of the
non-linear growth, collapse, and evolution of dark-matter halos. Rich clusters are now
confidently associated with the most massive, collapsed halos. N-body simulations predict the
abundance of halos (Sheth & Tormen 1999; Warren et al. 2006), their density profiles
(Navarro et al. 1997), their concentrations (Bullock et al. 2001; Eke et al. 2001;
Wechsler et al. 2002; Maccio et al. 2007; Neto et al. 2007), and their large-scale clustering
(Kaiser 1984; Mo et al. 1996; Seljak & Warren 2004; Wetzel et al. 2007). The abundance of dark
matter halos is a strong function of the cosmological parameters, especially $\sigma_8$, the
normalization of the matter power spectrum (White et al. 1993; Viana & Liddle 1999;
Bahcall et al. 2003). Moreover, the evolution of the cluster abundance with redshift is quite
sensitive to the equation of state of the dark energy (Haiman et al. 2001; Huterer & Turner 2001;
Newman et al. 2002; Levine et al. 2002). An accurate measurement of the cluster abundance can
thus be used to determine cosmological parameters.

However, to exploit clusters as cosmological probes re- 
quires knowledge of the relation between their observable 
properties and their masses — so far, a measurement 
of the cluster mass-observable relation with the neces- 
sary robustness and precision has been lacking. Various 
methods have been employed to detect clusters and to 
estimate their masses; each has advantages and disad- 
vantages, and it is likely that in the future they will be 
increasingly used in combination. 

Measurements of X-ray flux and temperature profiles, combined with the assumption that the
X-ray emitting gas is in hydrostatic equilibrium (HSE) in a spherically symmetric gravitational
potential, can be used to infer cluster mass profiles (Reiprich & Bohringer 2002;
Nagai et al. 2007). However, recent XMM-Newton and Chandra data (Markevitch & Vikhlinin 2007)
have shown that a fraction of clusters have complex luminosity and temperature structure,
perhaps associated with recent merger or AGN activity, calling into question the spherical HSE
assumption in those cases (Evrard et al. 1996). In addition, inference of the mass profile in
HSE requires measurement of the radial gas temperature profile, which in turn requires large
numbers of X-ray photons, so only nearby (Sanderson & Ponman 2004) or very massive clusters
(Allen et al. 2007) are suitable for this treatment. Future X-ray observatories such as XEUS
and Constellation-X will have greatly improved sensitivity and will therefore be able to probe
lower-mass clusters with this technique.

The Sunyaev-Zel'dovich (SZ) effect, another gas-based method of detecting clusters
(Grego et al. 2001; Carlstrom et al. 2002), has the advantage of being essentially redshift
independent. Theoretically, the integrated Sunyaev-Zel'dovich flux increment is tightly
correlated with cluster mass (Motl et al. 2005; Nagai 2006), and the slope of the relation
appears to be insensitive to gas dynamics in cluster cores. Challenges for this technique
(Hallman et al. 2006) include the identification and removal of contamination by radio point
sources (Vale & White 2006). Recent SZ measurements (LaRoque et al. 2006) will soon be
supplemented by studies from the APEX-SZ, the Sunyaev-Zel'dovich Array, and by large surveys
with, e.g., the Atacama Cosmology Telescope and the South Pole Telescope.

Dynamical cluster mass estimates, using the estimated velocity dispersion of cluster galaxies,
are also useful, but they require many spectroscopic measurements per cluster. The
interpretation of the velocity dispersion as a measure of the cluster mass also usually
requires assumptions about dynamical equilibrium and about the distribution of galaxy orbits
(velocity anisotropy), although techniques to bypass these assumptions by simulating cluster
galaxy dynamics directly have also been employed (Evrard et al. 2007). Dynamical estimates are
also subject to uncertainty in the relation between galaxy and dark matter velocity dispersion,
called velocity bias, which in principle requires inclusion of gas dynamics and stellar
feedback to properly simulate. Recent work indicates that this effect is small, but depends on
the type of galaxy sampled (e.g., Nagai & Kravtsov 2005; Diemand et al. 2004).

Gravitational lensing has proven an effective tool in probing the masses of clusters. Due to
the simplicity of the gravitational physics of lensing, it has become one of the most secure
ways of demonstrating the existence of dark matter (Clowe et al. 2006). Strong lensing, using
multiple images and arcs, can provide precise cluster mass estimates on small scales
(Hammer 1991; Kneib et al. 1996). However, strong lensing only occurs in very massive clusters;
moreover, strong lensing clusters may not be typical of clusters of their mass, since the
existence of arcs requires high central mass concentrations. Weak lensing has been used to
construct projected mass maps of individual clusters (e.g., Luppino & Kaiser 1997;
Preens et al. 2002; Cypriano et al. 2004; Bradac et al. 2006). However, individual weak lensing
cluster mass estimates inferred from shear measurements are subject to ~20% uncertainties
(Metzler et al. 1999, 2001; Hoekstra 2003; de Putter & White 2005), since they are sensitive to
all mass along the line of sight to the source galaxies, not just that associated with the
cluster. Weak lensing mass estimates are also affected by the "mass-sheet" degeneracy
(Bradac et al. 2003): adding a constant mass sheet to the 2D mass density does not change the
weak lensing shear.

Fortunately, to use clusters to constrain cosmological parameters, determination of the masses
of individual clusters is unnecessary, since cosmological predictions of structure formation
are statistical in nature. Cosmological theory robustly predicts the halo mass function
$n(M; z, \theta_i)$, where $\theta_i$ stands for a vector of cosmological parameters.
Astronomical observations measure the abundance of clusters sorted by some observable property
$O$, $n(O; z)$. To compare theoretical predictions with observations, we need to measure or
constrain the conditional probability distribution, $P(O|M; z)$, that a dark matter halo of
mass $M$ at redshift $z$ will be observed as a cluster with observable $O$ in a given survey,
including selection effects and biases. This is the approach employed, e.g., by
Rozo et al. (2007a), who adopt the halo occupation distribution (HOD) description of this
conditional probability distribution and marginalize over the HOD model parameters to arrive at
cosmological constraints. Alternatively, one could rely on, e.g., hydrodynamic or semi-analytic
galaxy formation models to directly predict $n(O; z, \theta_i)$, but the theoretical
uncertainties (which are roughly captured in the HOD model) are still large.

The method of "cross-correlation weak lensing" provides a direct estimate of the mean mass for
clusters with some observable property $O$ and therefore an important constraint on the
probability distribution $P(O|M; z)$ needed to connect cosmological theory with cluster
observations. Cross-correlation lensing consists of stacking the weak lensing signal from a
large number of objects, selected by some property $O$, to measure the average shear profile
with high signal-to-noise. By combining the signal from many lenses, the error on the mean
shear profile and on the inferred mean mass can in principle be reduced to the sub-percent
level; in that limit, systematic errors of interpretation start to dominate. Since less massive
objects are more abundant in the Universe, cross-correlation lensing can be used over a very
wide range of lens masses, from massive clusters down to galaxies, where it is referred to as
galaxy-galaxy lensing (Tyson et al. 1984; Brainerd et al. 1996; Fischer et al. 2000;
Sheldon et al. 2004; Mandelbaum et al. 2006). Because the method corresponds to a statistical
measurement of the lens-mass cross-correlation function (see §3), the inferred mean masses are
insensitive to uncorrelated mass along the line of sight to the source galaxies. For
cluster-scale lenses, the mean effects of correlated mass along the line of sight, e.g., in
neighboring clusters or filaments, are generally negligible out to scales comparable to the
cluster virial radius. Moreover, their effects can be measured and modeled, as we show in §4.
As a result, cross-correlation lensing is essentially free of the projection effects that
plague individual cluster lens mass estimates.

In Sheldon et al. (2007) (hereafter Paper I), we presented average shear profiles from
cross-correlation weak lensing measurements around ~130,000 clusters of galaxies from the
Sloan Digital Sky Survey (SDSS; York et al. 2000). These clusters were selected from the maxBCG
cluster catalog described in Koester et al. (2007b); the maxBCG cluster finding algorithm,
based on the red sequence of early-type cluster galaxies, is described in
Koester et al. (2007a).

In this paper, we analyze the detected lensing signal presented in Paper I and model the
features seen in the shear profiles. In §2 we summarize the relevant results from Paper I.
In §3 we apply the non-parametric inversions of Johnston et al. (2007) to infer the mean 3D
cluster mass density and aperture mass profiles in bins of optical richness and luminosity
(see §3.2). These inverted density and mass profiles, however, cannot be directly interpreted
as profiles of dark matter halos. In §4 we discuss why this is so and develop a parameterized
model which includes the effects of: displacement of the center of the cluster halo from the
brightest cluster galaxy (BCG); non-linear shear corrections; lensing by the central BCG; and
lensing by neighboring clusters and structures. When these effects are included, we find that
the inferred halo profiles are well fit by the universal dark matter profiles of Navarro,
Frenk & White (Navarro et al. 1997). In the context of this model, we estimate the average halo
virial mass, $M_{200}$, as a function of cluster galaxy richness and total galaxy luminosity.
We infer the mean halo concentration and halo bias as a function of $M_{200}$ and find them to
be in good agreement with the predictions of N-body simulations for the standard LCDM
cosmology. We then compare the inferred mean halo masses vs. galaxy richness to recent
dynamical mass estimates from measured velocity dispersions for the same cluster sample
(Becker et al. 2007); the two mass estimates agree very well, with the lensing estimates having
smaller errors. We conclude by discussing some cosmological applications of these results as
well as applications in future optical surveys.

For computing distances and, where needed, the linear power spectrum of density perturbations,
we use a spatially flat cosmological model with a cosmological constant and cold dark matter
(LCDM) with scaled CDM density $\Omega_m = 0.27$, baryon density $\Omega_b = 0.045$, scaled
Hubble parameter $h = 0.71$ (for the linear power spectrum, not distances) and primordial
spectral index $n_s = 0.95$. The linear power spectrum amplitude $\sigma_8$ is left free except
where specified. We employ the linear transfer function of Eisenstein and Hu
(Eisenstein & Hu 1998). This model (with $\sigma_8 = 0.8$) fits both the WMAP third-year data
(Spergel et al. 2007) and the SDSS luminous red galaxy (LRG) clustering data
(Eisenstein et al. 2005). All distances in this paper are in physical, not comoving, units of
$h^{-1}$ Mpc.

2. WEAK LENSING SHEAR MEASUREMENTS 

The methods of measuring the weak lensing signal are described in detail in Paper I. We briefly
summarize some of the important features here. For any projected mass distribution, the
azimuthally averaged tangential shear at projected radius $R$ from the center of the
distribution is given by
$\gamma(R) = \Delta\Sigma(R)/\Sigma_{\rm crit} = [\bar{\Sigma}(<R) - \bar{\Sigma}(R)]/\Sigma_{\rm crit}$,
where $\Sigma(R)$ is the 2D projected mass density at radius $R$, $\bar{\Sigma}(<R)$ is the
average of $\Sigma$ inside a disk of radius $R$, $\bar{\Sigma}(R)$ is the azimuthal average of
$\Sigma(R)$ in a thin annulus of radius $R$, and the critical density for strong lensing is
given by $\Sigma_{\rm crit} = c^2/(4\pi G)\, D_S/(D_L D_{LS})$, with $D_S$, $D_L$, $D_{LS}$ the
angular diameter distances from the observer to the source, to the lens, and between the lens
and source, respectively. These distances are cosmology-dependent functions of redshift.
Paper I presents average profiles of $\Delta\Sigma(R)$ for maxBCG clusters binned by cluster
galaxy number, $N_{200}$, and by optical luminosity, $L_{200}$. For these measurements, the
radius $R$ is defined with respect to the position of the BCG; see §4.3 for further discussion
of this point.

TABLE 1
12 N200 BINS

Bin number    N200      Number of clusters per bin
 1            3         58788
 2            4         27083
 3            5         14925
 4            6          8744
 5            7          5630
 6            8          3858
 7            9-11       6196
 8            12-17      4427
 9            18-25      1711
10            26-40       787
11            41-70       272
12            71-220       47

Note. The catalog is divided into 12 $N_{200}$ richness bins. This table shows the boundaries
of $N_{200}$ values and the number of clusters for each bin.
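The critical surface density defined in this section can be evaluated numerically. The following is a minimal sketch, assuming the flat LCDM model with $\Omega_m = 0.27$ quoted in the Introduction; the function names (`ang_diam_dist`, `sigma_crit`) and the source redshift used in the usage comment are illustrative assumptions, not quantities taken from Paper I.

```python
import math

# Physical constants (SI) and unit conversions.
C_M_S = 299_792_458.0           # speed of light [m/s]
G_SI = 6.67430e-11              # Newton's constant [m^3 kg^-1 s^-2]
MPC_M = 3.0856775814913673e22   # metres per Mpc
MSUN_KG = 1.98841e30            # kilograms per solar mass

# c^2 / (4 pi G), converted from kg/m to M_sun per Mpc.
SIGMA_PREF = C_M_S**2 / (4.0 * math.pi * G_SI) * MPC_M / MSUN_KG


def inv_e(z, omega_m):
    """1/E(z) for flat LCDM, with E^2 = Omega_m (1+z)^3 + (1 - Omega_m)."""
    return 1.0 / math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def ang_diam_dist(z1, z2, omega_m=0.27, n=2000):
    """Angular diameter distance between z1 < z2, in h^-1 Mpc (flat LCDM)."""
    d_hubble = C_M_S / 1.0e3 / 100.0  # c/H0 in h^-1 Mpc for H0 = 100 h km/s/Mpc
    dz = (z2 - z1) / n
    # Composite trapezoid rule for the comoving-distance integral.
    total = 0.5 * (inv_e(z1, omega_m) + inv_e(z2, omega_m))
    for i in range(1, n):
        total += inv_e(z1 + i * dz, omega_m)
    return d_hubble * total * dz / (1.0 + z2)


def sigma_crit(z_lens, z_src, omega_m=0.27):
    """Sigma_crit = c^2/(4 pi G) D_S/(D_L D_LS), in h M_sun per physical Mpc^2."""
    d_l = ang_diam_dist(0.0, z_lens, omega_m)
    d_s = ang_diam_dist(0.0, z_src, omega_m)
    d_ls = ang_diam_dist(z_lens, z_src, omega_m)
    return SIGMA_PREF * d_s / (d_l * d_ls)

# Example call (the source redshift 0.5 is an arbitrary illustrative value):
# sigma_crit(0.25, 0.5)
```

Because the distances are expressed in $h^{-1}$ Mpc, the result carries one power of $h$, consistent with the physical $h^{-1}$ Mpc units used for distances in this paper.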

2.1. Richness and Luminosity measures $N_{200}$ and $L_{200}$

Although the richness and luminosity measures $N_{200}$ and $L_{200}$ are discussed in detail
in Paper I, here we emphasize some of their important features to avoid possible confusion.
$N_{200}$ and $L_{200}$ are the galaxy number and total i-band luminosity measured within a
projected radius we call $r_{200}^{gal}$, in both cases counting only red-sequence galaxies
with luminosities larger than $0.4 L_*$ and satisfying other selection criteria (see
Koester et al. 2007a for details). This radius is not, by definition, equivalent to the
$r_{200}$ defined by the mass (Eqn. 4), which can in principle be measured directly from
lensing, since $r_{200}$ is not known prior to performing the weak lensing analysis. Instead,
$r_{200}^{gal}$ is determined by first measuring the number of galaxies, $N_{gal}$, within a
fixed 1 $h^{-1}$ Mpc aperture and calculating
$r_{200}^{gal} = 0.156\, N_{gal}^{0.6}\, h^{-1}$ Mpc, as discussed in Hansen et al. (2005).
Nevertheless, we find that $r_{200}^{gal}$ is in fact a good approximation to $r_{200}$ as
determined in this paper from the lensing data, to within about 5%. The mass-to-light ratio as
a function of radius will be presented in Paper III of this series (Sheldon et al. 2007). Note
that $N_{200}$ is dimensionless, and $L_{200}$ has units of $10^{10}\, h^{-2} L_\odot$.

For the purpose of lensing measurement, the catalog is subdivided into 12 $N_{200}$ richness
bins and 16 $L_{200}$ luminosity bins. The bin boundaries for each measure as well as the
number of clusters per bin are displayed in Tables 1 and 2.

3. INVERTING CLUSTER PROFILES 

3.1. Inversion Method 

The methods used to invert the lensing $\Delta\Sigma(R)$ profiles to 3D density and mass
profiles are discussed in detail in Johnston et al. (2007) and were first used by
Sheldon et al. (2004) to obtain the galaxy-mass correlation function from galaxy-galaxy lensing
measurements. Here, we provide a brief overview of the methods.

Fig. 1. Left: Weak lensing profiles $\Delta\Sigma(R)$ for 12 bins of optical richness,
$N_{200}$. Right: $\Delta\Sigma(R)$ for 16 i-band luminosity bins, $L_{200}$.

TABLE 2
16 L200 BINS

Bin number    L200 (10^10 h^-2 L_sun)    Number of clusters per bin
 1            5 - 6.24                   19618
 2            6.24 - 7.8                 18597
 3            7.8 - 9.74                 16042
 4            9.74 - 12.2                12269
 5            12.2 - 15.2                 9010
 6            15.2 - 19.0                 6152
 7            19.0 - 23.7                 4164
 8            23.7 - 29.6                 2666
 9            29.6 - 36.9                 1703
10            36.9 - 46.1                 1042
11            46.1 - 57.6                  638
12            57.6 - 71.9                  344
13            71.9 - 89.8                  210
14            89.8 - 112.1                 108
15            112.1 - 140                   19
16            140 - 450                     46

Note. The catalog is also divided into 16 $L_{200}$ richness bins. This table shows the
boundaries of $L_{200}$ values and the number of clusters for each bin.

The mean excess 3D density profile $\Delta\rho(r)$ around a set of clusters with a given
observable $O$ (e.g., richness or luminosity) is best thought of in terms of the cluster-mass
two-point correlation function, $\xi_{cm}$, since $\Delta\rho(r) = \bar{\rho}\,\xi_{cm}(r)$,
where $\bar{\rho}$ is the mean density of the Universe. By the assumptions of spatial
homogeneity and isotropy, $\xi_{cm}$ depends only on the magnitude of the separation, $r$, not
on direction. As a consequence, the mean density profile $\Delta\rho(r)$ should be very nearly
spherically symmetric. Note that this is a purely statistical statement: we do not assume that
individual cluster density profiles are spherically symmetric. The spherical symmetry of the
average density profile enables the inversion of the stacked lensing signal $\Delta\Sigma(R)$
to the 3D density $\Delta\rho(r)$ and the aperture mass $M(r)$. By contrast, weak lensing
measurements of individual clusters can only be used to reconstruct the projected 2D mass
density, $\Sigma(x)$, since lensing is produced by all of the mass projected along the line of
sight.

The mean 3D density profile is obtained as an integral of the derivative of the shear profile
$\Delta\Sigma(R)$ through a purely geometric relation,

$$ \Delta\rho(r) = \frac{1}{\pi} \int_r^{\infty} dR\, \frac{-\Sigma'(R)}{\sqrt{R^2 - r^2}} , \tag{1} $$

where a prime denotes a derivative with respect to $R$. The lensing data $\Delta\Sigma$ enters
here since it can be shown that

$$ -\Sigma'(R) = \Delta\Sigma'(R) + 2\,\Delta\Sigma(R)/R . \tag{2} $$

The 3D mass profile is given in terms of $\Delta\Sigma(R)$ and $\Delta\rho(r)$ as

$$ M(R) = \pi R^2\, \Delta\Sigma(R) + 2\pi \int_R^{\infty} dr\, r\, \Delta\rho(r) \left[ \frac{R^2}{\sqrt{r^2 - R^2}} - 2\left(r - \sqrt{r^2 - R^2}\right) \right] . \tag{3} $$

In practice, these integrals must be truncated at some maximum radius, $R_{max}$, the largest
scale at which one has lensing data (30 $h^{-1}$ Mpc for our data). The uncertainty from this
truncation is related to the mass-sheet degeneracy. Due to the steepness of the cluster
profiles we infer in this paper, this truncation creates only a few percent uncertainty in the
last few radial bins of both density and mass, and virtually none in bins at smaller radii.
Complete details of the procedure are given in Johnston et al. (2007).
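The inversion of Eqns. 1 and 2 can be illustrated with a short numerical sketch. It is not the binned estimator of Johnston et al. (2007); it assumes that $\Delta\Sigma(R)$ and its derivative are available as smooth callables, and the function name `delta_rho` and the substitution $R = r\cosh u$ (which removes the integrable singularity at $R = r$) are choices made here for illustration.

```python
import math


def delta_rho(r, dsig, dsig_prime, u_max=20.0, n=4000):
    """Estimate Delta rho(r) from Delta Sigma(R) using Eqns. 1 and 2.

    With R = r cosh(u), dR / sqrt(R^2 - r^2) = du, so Eqn. 1 becomes
    Delta rho(r) = (1/pi) * integral_0^inf du [-Sigma'(r cosh u)].
    The finite u_max plays the role of the truncation radius R_max.
    """
    def integrand(u):
        big_r = r * math.cosh(u)
        # Eqn. 2: -Sigma'(R) = DeltaSigma'(R) + 2 DeltaSigma(R) / R
        return dsig_prime(big_r) + 2.0 * dsig(big_r) / big_r

    du = u_max / n
    # Composite trapezoid rule on [0, u_max].
    total = 0.5 * (integrand(0.0) + integrand(u_max))
    for i in range(1, n):
        total += integrand(i * du)
    return total * du / math.pi
```

As a check under these assumptions, a singular isothermal profile with $\Delta\Sigma(R) = A/R$ should invert to $\Delta\rho(r) = A/(\pi r^2)$, which can be verified by calling `delta_rho(r, lambda R: A / R, lambda R: -A / R**2)`.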
3.2. 3D Density and Mass Profiles

The inverted 3D density profiles for each of the 12 $N_{200}$ richness and 16 $L_{200}$
luminosity bins are presented in Figure 2. These profiles are noisier than the shear profiles,
since they involve derivatives of noisy data. The differentiation in Eqn. 1 also leads to
anti-correlations between neighboring radial bins of $\Delta\rho(r)$.

Figure 3 shows the inverted mean aperture mass profiles, $M(r)$, for the same richness and
luminosity bins as above. Since the mass profile is an integral of the density profile, it is
smoother than the latter, and neighboring bins of $M(r)$ are statistically correlated. This
allows one to better see the deviations from power-law behavior that one expects from the halo
model (see §4).

3.3. Direct Measurements of $r_{200}$ and $M_{200}$

The radius $r_{200}$ is defined, herein, as the radius within which the average density is 200
times the critical density $\rho_c$. This defines the corresponding mass scale,

$$ M_{200} = M(r_{200}) = 200\, \rho_c(z)\, \frac{4}{3}\pi\, r_{200}^3 , \tag{4} $$

where $\rho_c(z) = 3H^2(z)/(8\pi G)$ is the critical density at epoch $z$, and the Hubble
parameter satisfies $H^2(z) = H_0^2\, [\Omega_m (1+z)^3 + (1 - \Omega_m)]$ for a flat LCDM
Universe. Throughout this paper we use $z = 0.25$, the mean cluster redshift for our sample,
and $\Omega_m = 0.27$ to compute $\rho_c(z)$. For these choices, the conversion between
$M_{200}$ and $r_{200}$ is
$M_{200} = 2.923 \times 10^{14}\, h^{-1} M_\odot\, (r_{200} / h^{-1}\,{\rm Mpc})^3$.
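The numerical coefficient in this conversion follows directly from Eqn. 4. The sketch below reproduces it, assuming the standard present-day critical density $\rho_{c,0} \simeq 2.775 \times 10^{11}\, h^2 M_\odot\,{\rm Mpc}^{-3}$ (a value not quoted in the text); the function names are illustrative.

```python
import math

# Present-day critical density, 3 H0^2 / (8 pi G), in h^2 M_sun / Mpc^3.
# This constant is an assumption of the sketch; it is not quoted in the text.
RHO_CRIT0 = 2.775e11


def rho_crit(z, omega_m=0.27):
    """Critical density at redshift z for flat LCDM [h^2 M_sun / Mpc^3]."""
    return RHO_CRIT0 * (omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def m200_from_r200(r200, z=0.25, omega_m=0.27):
    """Eqn. 4: M200 in h^-1 M_sun for r200 in physical h^-1 Mpc."""
    return 200.0 * rho_crit(z, omega_m) * (4.0 / 3.0) * math.pi * r200 ** 3


def r200_from_m200(m200, z=0.25, omega_m=0.27):
    """Inverse of Eqn. 4: r200 in h^-1 Mpc for M200 in h^-1 M_sun."""
    coeff = 200.0 * rho_crit(z, omega_m) * (4.0 / 3.0) * math.pi
    return (m200 / coeff) ** (1.0 / 3.0)
```

Evaluating `m200_from_r200(1.0)` with the default arguments gives a coefficient that agrees with the quoted $2.923 \times 10^{14}\, h^{-1} M_\odot$ to better than 0.1%.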

Using the inverted mass profiles shown in Fig. 3, we can determine $r_{200}$ and $M_{200}$ in a
model independent way, by simply measuring where the curve $200\,\rho_c(z)\,(4/3)\pi r^3$
crosses the mass profile. This procedure, which is illustrated in Fig. 3, requires one to
interpolate the data between the two radii closest to the crossing point. This interpolation
can in principle be ill-defined if the data are noisy and the profiles non-monotonic, but that
never occurs for any of our profiles. We have experimented with a few different ways of
interpolating. One can use the unique power law defined by the two neighboring data points or
fit a power law to a four-point neighborhood of the crossing. We find that these methods give
essentially identical answers; since the four-point method yields slightly lower scatter in the
mass-richness relation, we use that method.
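The crossing-point procedure can be sketched as follows. For brevity this example implements the simpler two-point variant (the unique power law through the two bracketing data points) rather than the four-point fit adopted in the paper; the function name `find_r200` and the input arrays are hypothetical.

```python
import math

# Coefficient of M200 = COEFF * r200^3 quoted in the text
# (z = 0.25, Omega_m = 0.27; masses in h^-1 M_sun, radii in h^-1 Mpc).
COEFF = 2.923e14


def find_r200(radii, masses, coeff=COEFF):
    """Locate where a tabulated M(r) crosses coeff * r^3.

    radii  : increasing radii [h^-1 Mpc]
    masses : aperture masses M(r) at those radii [h^-1 M_sun]
    Returns (r200, M200) from a two-point power-law interpolation.
    """
    # f > 0 where the mean enclosed density exceeds 200 rho_c, f < 0 outside.
    f = [math.log(m) - math.log(coeff * r ** 3) for r, m in zip(radii, masses)]
    for i in range(len(f) - 1):
        if f[i] >= 0.0 > f[i + 1]:
            # Power law M = A r^b through the two bracketing points.
            b = (math.log(masses[i + 1]) - math.log(masses[i])) / (
                math.log(radii[i + 1]) - math.log(radii[i]))
            ln_a = math.log(masses[i]) - b * math.log(radii[i])
            # Solve A r^b = coeff r^3 for r.
            r200 = math.exp((math.log(coeff) - ln_a) / (b - 3.0))
            return r200, coeff * r200 ** 3
    raise ValueError("mass profile does not cross the 200 rho_c curve")
```

A four-point version would replace the two-point slope and intercept with a least-squares straight-line fit in log-log space over the four points nearest the crossing; the rest of the calculation is unchanged.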

While this procedure for inferring $r_{200}$ and $M_{200}$ has the advantage of being model
independent, the results cannot be interpreted as the virial radii and masses of the
corresponding dark halos. The primary reason is that BCGs, which we use to define the center of
each cluster for the lensing measurements, are not always positioned at the center of mass of
the underlying dark matter halo. This fact, which we observe in our simulations, is not
surprising: for this analysis, clusters are the objects identified by the maxBCG algorithm,
while dark matter halos are theoretical constructs, and the two are not in precise one-to-one
correspondence (Cohn et al. 2007; Rozo et al. 2007a). The model-independent profiles of Fig. 3,
and the corresponding values of $r_{200}$ and $M_{200}$, are the "true" mass profiles of
clusters centered on their BCGs. However, to estimate dark matter halo profiles and masses, we
must adopt a model to describe the data, which we do in the next section. When we do so, we
find that the inferred dark matter halo masses are about 50% higher than the model-independent
cluster masses. We use the results of those model fits to constrain the halo mass-richness
relations and other scaling relations.

TABLE 3
Direct cluster $M_{200}^{cl}$-Richness calibration: $N_{200}$ Bins

<N200>     M200^cl (10^12 h^-1 M_sun)     r200^cl (h^-1 Mpc)
 3.00        4.26 ±  0.45                 0.24 ± 0.01
 4.00        5.29 ±  0.65                 0.26 ± 0.01
 5.00        8.01 ±  1.34                 0.30 ± 0.02
 6.00       13.15 ±  1.65                 0.36 ± 0.01
 7.00        9.66 ±  2.28                 0.32 ± 0.03
 8.00       12.71 ±  3.36                 0.35 ± 0.03
 9.82       25.53 ±  2.86                 0.44 ± 0.02
13.91       42.31 ±  3.42                 0.53 ± 0.01
20.78       74.45 ±  7.46                 0.63 ± 0.02
31.09      123.22 ± 11.28                 0.75 ± 0.02
50.27      199.26 ± 24.81                 0.88 ± 0.04
92.18      502.87 ± 87.61                 1.20 ± 0.07

Note. The $M_{200}^{cl}$-richness relation for the $N_{200}$ richness bins. This estimate of
$M_{200}$, which we call $M_{200}^{cl}$, is meant to represent the $M_{200}$ of the clusters as
opposed to the dark matter halos. It is estimated non-parametrically by determining where the
3D mass profile $M(r)$ crosses the line determined by $(4/3)\pi r^3\, 200\, \rho_{crit}(z)$.
These masses differ from the parametric masses that include cluster miscentering and other
effects.

TABLE 4
Direct cluster $M_{200}^{cl}$-richness calibration: $L_{200}$ Bins

<L200>      M200^cl (10^12 h^-1 M_sun)     r200^cl (h^-1 Mpc)
  5.59        4.59 ±  0.77                 0.25 ± 0.01
  6.97        5.68 ±  0.81                 0.27 ± 0.01
  8.69        6.16 ±  0.96                 0.28 ± 0.01
 10.84       12.86 ±  1.42                 0.35 ± 0.01
 13.53       11.98 ±  1.80                 0.34 ± 0.02
 16.89       22.92 ±  2.89                 0.43 ± 0.02
 21.06       30.94 ±  3.60                 0.47 ± 0.02
 26.31       41.36 ±  4.51                 0.52 ± 0.02
 32.89       56.90 ±  7.80                 0.58 ± 0.03
 40.95       77.67 ±  9.78                 0.64 ± 0.03
 51.19       99.05 ± 13.49                 0.70 ± 0.03
 64.08      160.65 ± 22.19                 0.82 ± 0.04
 79.89      160.16 ± 30.13                 0.82 ± 0.05
 98.69      182.81 ± 35.58                 0.86 ± 0.06
124.59      258.49 ± 53.30                 0.96 ± 0.07
184.65      553.76 ± 93.41                 1.24 ± 0.07

Note. The $M_{200}^{cl}$-richness relation for the $L_{200}$ richness bins.

We will distinguish these two types of masses by referring to the parametric halo masses as
$M_{200}$ and non-parametric cluster masses as $M_{200}^{cl}$. For completeness, we present
these cluster masses in Tables 3 and 4, but we will not use them elsewhere in this work. In
another publication, Paper III of this series on cluster mass-to-light ratios (Sheldon et al.),
we will refer to these non-parametric $M_{200}^{cl}$ masses.

4. HALO MODEL FITS TO LENSING PROFILES 

To proceed, we construct a physical model of the average mass density in clusters that
comprises three components: the central BCG, the cluster-scale dark matter halo in which it
sits, and neighboring mass concentrations. We will also consider non-linear shear. We treat
these in turn.

Fig. 2. Left: Inverted mean density profiles, $\Delta\rho(r)$, for the 12 $N_{200}$ richness
bins shown in Fig. 1. Right: Inverted $\Delta\rho(r)$ profiles for the 16 $L_{200}$ richness
bins shown in Fig. 1.

Fig. 3. Left: Inverted 3D aperture mass profiles, $M(r)$, for the 12 $N_{200}$ richness bins.
The dotted blue diagonal line in each panel denotes $200\,\rho_c\,(4/3)\pi r^3$ (see Eqn. 4);
this crosses the mass profile at $r_{200}$ and $M_{200}$, which are indicated with the dashed
red vertical and horizontal lines. Right: Inverted 3D aperture mass profiles, $M(r)$, for the
16 $L_{200}$ richness bins.

4.1. The BCG 

Since every maxBCG cluster, by design, is centered on a bright galaxy, we should allow for a
contribution to the mass from the baryons (mainly stars) and from the dark matter sub-halo of
the BCG (assuming the latter is not modeled by the central cusp of the cluster-scale halo).
Gavazzi et al. (2007) find that a central baryonic component is required to fit both the strong
and weak lensing profiles of early-type galaxies in the SLACS survey. Although this
contribution could be modeled in a number of ways, e.g., by using a de Vaucouleurs profile, its
effects are only significant on very small scales, and its form is not well constrained by our
data. Therefore, we simply model this contribution as a central point mass, $M_0$, with lensing
signal $\Delta\Sigma = M_0/(\pi R^2)$, where $M_0$ is a model parameter to be fit.

4.2. The cluster dark matter halo 

Out to radii of a few Mpc, the density profiles appear to be dominated by the cluster-scale
dark matter halos. N-body simulations of structure formation with cold dark matter indicate
that halos are reasonably well modeled by the universal (NFW) profiles of
Navarro et al. (1997),

$$ \rho_{\rm NFW}(r) = \frac{\delta\, \rho_c(z)}{(r/r_s)\,(1 + r/r_s)^2} . \tag{5} $$

This form contains two free parameters, a scale radius $r_s$ and an amplitude $\delta$;
$\rho_c(z)$ is the critical density at redshift $z$. At $r \sim r_s$, the logarithmic slope of
the NFW profile changes between the asymptotic values of $-1$ at small scales ($r \ll r_s$) and
$-3$ at large scales ($r \gg r_s$). The parameters $\delta$ and $r_s$ are usually traded for a
description in terms of $r_{200}$ (or equivalently $M_{200}$) and $c_{200}$. As above,
$r_{200}$ is the radius within which the mean density is 200 times the critical density and for
which the enclosed mass is $M(r_{200}) = 200\,\rho_c(z)\,(4/3)\pi\, r_{200}^3$, while
$c_{200} = r_{200}/r_s$ is the concentration. The amplitude $\delta$ can be expressed in terms
of $c_{200}$ as

$$ \delta = \frac{200}{3}\, \frac{c_{200}^3}{\ln(1 + c_{200}) - c_{200}/(1 + c_{200})} . \tag{6} $$

Analytic expressions for the shear profile $\Delta\Sigma(R; c_{200}, M_{200})$ of NFW halos
can be found in, e.g., Wright & Brainerd (2000). Various other definitions of the virial radius
have been used in the literature, e.g., the radius within which the mean density is
$180\,\bar{\rho}(z)$ instead of $200\,\rho_c(z)$. We discuss the conversion among these
different systems in the Appendix.

4.3. Miscentering of the BCG and the Halo

For the lensing measurements, the center of each cluster ($R = 0$) is defined to be the
position of the BCG identified by the cluster-finding algorithm. As noted in §3.3, some
fraction of the BCGs may be offset from the centers of the corresponding dark matter halos.
Such "miscentering" changes the observed tangential shear profile. If the 2D offset in the lens
plane is $R_s$, then the azimuthally averaged $\Sigma(R)$ profile is given by the convolution

$$ \Sigma(R|R_s) = \frac{1}{2\pi} \int_0^{2\pi} d\theta\; \Sigma\!\left( \sqrt{R^2 + R_s^2 + 2 R R_s \cos\theta} \right) \tag{7} $$

(Yang et al. 2006).
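The azimuthal average in Eqn. 7 is straightforward to evaluate numerically for a single offset $R_s$. The following is a minimal sketch, assuming the centered profile $\Sigma(R)$ is supplied as a callable; the function name `miscentered_sigma` and the midpoint-rule discretization are choices made here, not the implementation used in the paper.

```python
import math


def miscentered_sigma(sigma, big_r, r_off, n=720):
    """Eqn. 7: azimuthal average of Sigma at radius big_r about a centre
    displaced by r_off from the true halo centre.

    sigma : callable giving the centred profile Sigma(R)
    n     : number of midpoint-rule samples in theta over [0, 2 pi)
    """
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * (k + 0.5) / n
        # Squared distance from the true centre to the point (big_r, theta).
        d2 = big_r * big_r + r_off * r_off + 2.0 * big_r * r_off * math.cos(theta)
        total += sigma(math.sqrt(max(d2, 0.0)))
    return total / n
```

For `r_off = 0` this returns `sigma(big_r)`, and for the test profile $\Sigma(R) = R^2$ it returns $R^2 + R_s^2$, since the cosine term averages to zero. Averaging the result over a distribution of offsets $P(R_s)$ would be a further step on top of this sketch.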

To make progress, we need to know something about the distribution of offsets, $P(R_s)$. In
order to estimate this, we employ N-body simulation-based mock galaxy catalogs that have been
constrained to have realistic luminosities, colors, clustering properties, and cluster
populations. These catalogs, which have been used in previous maxBCG studies
(Koester et al. 2007a; Rozo et al. 2007a,b), populate a dark matter simulation with galaxies
using the ADDGALS technique (Wechsler et al. 2007). The catalog is based on the light-cone from
the Hubble Volume simulation (Evrard et al. 2002), and extends from $0 < z < 0.34$. Galaxies
are assigned directly to dark matter particles in the simulation, with a luminosity-dependent
bias scheme that is tuned to match local clustering data. The galaxy luminosities are first
assigned in the $^{0.1}r$-band, drawn from the luminosity function of Blanton et al. (2003).
The luminosity function is assumed to evolve with $Q = 1.3$ magnitudes per unit redshift. We
first constrain the relationship between galaxy luminosity and Lagrangian matter densities on a
scale of $\sim M_*$, using the luminosity-dependent two-point clustering of SDSS galaxies
(Zehavi et al. 2005). For each galaxy, a dark matter particle is then chosen on the basis of
this density with some $P(\delta|M_r)$. Each mock galaxy is then assigned to a real SDSS galaxy
that has approximately the same luminosity and local galaxy density, measured here as the
distance to the fifth nearest neighbor. The color for each mock is then given by the SED of
this matched galaxy transformed to the appropriate redshift. Because BCGs are now known to be
distinct from the general galaxy population, BCG properties are further tuned to match the
luminosities and colors of observed BCGs; in addition, a BCG is placed at the center of each
dark matter halo. This procedure produces a catalog which matches several statistics of the
observed SDSS population, including the location, width and evolution of the ridgeline, which
makes it ideal for testing the maxBCG algorithm. In this work, we use five galaxy realizations
that have been run using the same underlying dark matter simulation; to improve our statistics,
we merge all five mock catalogs into one.

The maxBCG algorithm is then used to identify clusters
in the mock catalogs, and the resulting BCG positions
can be compared to the centers of mass of the dark
matter halos in the input N-body simulations. We use
the matching technique described in Rozo et al. (2007a)
to match clusters to halos, and directly compute the offset
$R_s$ between the halo center and the BCG assigned
to the halo by the maxBCG cluster finding algorithm.
In the real Universe, miscentering for our cluster population
can occur for either of two reasons: the real
BCG can be offset from the center of mass, or the BCG
can be misidentified by the cluster finder. In the mock
catalogs, there is always a bright galaxy at the center
of the dark matter halo, so we are neglecting here the
first case. Although this is not likely to be precisely true
in all cases, our results indicate that miscentering due
to misidentified BCGs dominates the effects we discuss
below.

For these catalogs, a richness-dependent fraction of the
BCGs appears to be accurately centered on their dark
matter halos ($R_s \simeq 0$), while the rest are reasonably well
described by a 2D Gaussian distribution,

$$ P(R_s) = \frac{R_s}{\sigma_s^2}\,\exp\!\left(-\tfrac{1}{2}(R_s/\sigma_s)^2\right) \tag{8} $$

with $\sigma_s = 0.42\,h^{-1}$ Mpc, independent of cluster richness
(see next section). The resulting mean surface mass
profile for the miscentered clusters can be written

$$ \Sigma^s_{NFW}(R) = \int dR_s\, P(R_s)\, \Sigma_{NFW}(R|R_s) \tag{9} $$

and $\Delta\Sigma^s_{NFW}(R) = \bar{\Sigma}^s_{NFW}(<R) - \Sigma^s_{NFW}(R)$. We find
that the mean shear profile is not very sensitive to the
shape of the distribution of $R_s$, but it is sensitive to the
effective scale length $\sigma_s$.
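The smoothing of Eqn. 9 can be sketched as a double quadrature over the offset $R_s$ and the azimuthal angle. This is only an illustration under stated assumptions: the offset distribution is taken to be the radial distribution of a 2D Gaussian (Eqn. 8), the centered profile is supplied as a vectorized function `sigma(r)`, and the grid sizes and the cutoff at $6\sigma_s$ are arbitrary choices.

```python
import numpy as np

def offset_pdf(R_s, sigma_s=0.42):
    """Radial distribution of a 2D Gaussian offset (Eqn. 8); h^-1 Mpc."""
    return (R_s / sigma_s**2) * np.exp(-0.5 * (R_s / sigma_s) ** 2)

def smoothed_profile(sigma, R, sigma_s=0.42, n_rs=400, n_theta=360):
    """Average Sigma(R|R_s) over P(R_s) (Eqn. 9) with midpoint quadrature."""
    # Integrate R_s out to 6 sigma_s, beyond which P(R_s) is negligible.
    edges = np.linspace(0.0, 6.0 * sigma_s, n_rs + 1)
    rs = 0.5 * (edges[1:] + edges[:-1])
    d_rs = edges[1] - edges[0]
    theta = (np.arange(n_theta) + 0.5) * (2.0 * np.pi / n_theta)
    # r[i, j]: distance from the halo center for offset rs[i], angle theta[j]
    r = np.sqrt(R**2 + rs[:, None] ** 2
                + 2.0 * R * rs[:, None] * np.cos(theta)[None, :])
    ring_mean = sigma(r).mean(axis=1)          # Sigma(R | R_s), Eqn. 7
    return np.sum(offset_pdf(rs, sigma_s) * ring_mean) * d_rs
```

For the toy profile $\Sigma(r) = r^2$ the result should approach $R^2 + 2\sigma_s^2$, since $\langle R_s^2 \rangle = 2\sigma_s^2$ for this distribution.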

Figure 4 shows the effects of such miscentering on the
lensing signal for a cluster with an NFW profile. The
effect on $\Delta\Sigma(R)$ is much larger than on $\Sigma(R)$: the
convolution in Eqn. 7 leads to a smoothing which essentially
flattens the $\Sigma^s(R)$ profile at small scales, creating
a mass sheet which causes little shear. While the
$\Delta\Sigma_{NFW}(R)$ profile is relatively flat at small scales, the
smoothed $\Delta\Sigma^s_{NFW}(R)$ profile is strongly suppressed at
scales $R < 2.5\,\sigma_s$.

In applying this model to the data in §5, we include
$\ln(\sigma_s)$ as a model parameter, using its value from the
mock catalogs as the central value of a Gaussian prior
probability distribution. We assume that a fraction $p_c$
of the BCGs are accurately centered on the dark matter
halos, and that a fraction $1 - p_c$ follow the distribution
of Eqn. 8. The simulations are used to formulate a prior
distribution for $p_c$, as described in §4.5.

We determine this fraction $p_c$ of correctly centered
BCGs as a function of $N_{200}$; this is shown in the left panel
of Figure 5. We can model this relation as
$p_c(N_{200}) = 1/(1 + \exp(-q))$ with

$$ q = \ln(1.13 + 0.92\,(N_{200}/20)). \tag{10} $$

The dotted lines show the statistical 95% confidence
bands recovered in the simulations, whereas the dashed
lines show the 95% bands corresponding to the much
more generous 0.4 prior on $q$ used in our analysis as
described in §4.5. The right panel of Figure 5 shows the
miscentering distribution $P(R_s)$. The data from the
simulations are roughly fit by a two-dimensional Gaussian of
width $\sigma_s = 0.42\,h^{-1}$ Mpc. Note that because the mock
catalogs place the BCG of a halo at the center of the
halo, the offset $R_s$ is identically zero if maxBCG assigns
the correct BCG to each cluster.
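To make the richness dependence concrete, the sketch below evaluates the fitted relation of Eqn. 10 and the logistic map between $q$ and $p_c$. The function names are illustrative; only the coefficients 1.13 and 0.92 and the pivot richness of 20 come from the text.

```python
import math

def q_mean(n200):
    """Prior mean of q from the mock-catalog fit (Eqn. 10)."""
    return math.log(1.13 + 0.92 * (n200 / 20.0))

def p_c_from_q(q):
    """Logistic map from q to the correctly centered fraction p_c."""
    return 1.0 / (1.0 + math.exp(-q))

def q_from_p_c(p_c):
    """Inverse map, q = ln[p_c / (1 - p_c)]."""
    return math.log(p_c / (1.0 - p_c))

for n200 in (3, 10, 20, 50, 100):
    print(n200, round(p_c_from_q(q_mean(n200)), 3))
```

Because $q$ is the log-odds of correct centering, the relation reduces to $p_c = (1.13 + 0.92x)/(2.13 + 0.92x)$ with $x = N_{200}/20$, so the centered fraction rises monotonically with richness.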

Our best fit model is shown as a solid line, while the
dashed lines show the models that bound the 68% confidence
regions corresponding to the 30% Gaussian prior
on $\sigma_s$, as used in §4.5 to fit the data. It is
clear that our adopted priors are much more generous
than the statistical noise in the simulations. We choose
this wider prior since there may be differences between
the mock catalogs and the real data. The wider prior
can likely account for most real offsets between BCGs
and the center of the mass concentration. Finally, we
Fig. 4. — Effect of an offset between the BCG and the halo center
on the projected mass profile $\Sigma(R)$ and the lensing signal $\Delta\Sigma(R)$.
The black solid curve shows the $\Sigma(R)$ profile for an NFW halo with
$c_{200} = 5$ and $r_{200} = 1\,h^{-1}$ Mpc. The black dashed curve shows the
corresponding $\Delta\Sigma(R)$ profile. The red curves show the resulting
mean profiles when the distribution of randomly-oriented BCG-halo
offsets is a 2D Gaussian with dispersion $\sigma_s = 0.42\,h^{-1}$ Mpc
(indicated by the blue vertical line). The red solid curve shows the
smoothed $\Sigma^s(R)$ and the red dashed curve the smoothed $\Delta\Sigma^s(R)$
profile. Miscentering has the effect of making $\Sigma^s(R)$ nearly
flat, i.e., a mass sheet, at small scales. Although $\Sigma(R)$ and $\Sigma^s(R)$
differ by only 10-30% near $r = \sigma_s$, $\Delta\Sigma$ and $\Delta\Sigma^s$ differ by an order
of magnitude. For this example, $\Delta\Sigma^s(R)$ peaks at $r \sim 2.5\,\sigma_s$; this
behavior depends slightly on $c_{200}$.



emphasize here that we are adopting the same miscen- 
tering distribution for all richness bins. The differences 
between the various richness bins in the mock data are 
much smaller than the 30% prior that we use. 

4.4. Neighboring mass concentrations 

The NFW profile is expected to be a good representation
of the stacked mass profiles on small to intermediate
scales surrounding clusters, but on large scales the
lensing signal is dominated by neighboring mass concentrations,
e.g., nearby halos and filaments. We model this
contribution via the so-called two-halo term (Seljak 2000;
Mandelbaum et al. 2005),

$$ \rho_{2h}(r) = b(M_{200}, z)\,\Omega_m\,\rho_{c,0}\,(1+z)^3\,\xi_l(r, z) \tag{11} $$

where $\rho_{c,0}$ is the critical density at the present epoch,
and $\xi_l(r, z)$ is the auto-correlation function of the mass
in linear perturbation theory, evaluated at the redshift
of the clusters. Here, $b(M_{200}, z)$ is the linear bias parameter
for dark matter halos, which has a predicted dependence
upon halo mass and redshift (Sheth & Tormen
1999; Seljak & Warren 2004).

The shape of the linear correlation function is determined
by the cosmological parameters $n_s$, $h$, and $\Omega_m$ for
a flat LCDM model and is constrained by observations
of galaxy clustering (Eisenstein et al. 2005; Zehavi et al.
2004). The linear correlation function can be expressed
as

$$ \xi_l(r, z) = D(z)^2\,\sigma_8^2\,\xi_l((1+z)\,r), \tag{12} $$

where $\xi_l(r)$ with a single argument is the linear correlation
function evaluated at $z = 0$ and normalized to
$\sigma_8 = 1$. The presence of the factor of $(1 + z)$ in the above





Fig. 5. — Left: The probability that a cluster is correctly centered as a function of cluster richness, $N_{200}$, in mock catalogs. The
diamonds with error bars are the measurements in the simulations, whereas the solid line is our best fit model (see text for details). The
dotted lines show the 95% confidence ($2\sigma$) band from statistical uncertainties only. The dashed lines show the more generous 95% confidence
region corresponding to the adopted 0.40 prior uncertainty on $q$, which is wider to allow for some possibility that there are differences in
the probability of a cluster being correctly centered between our mocks and the real data. Right: The distribution of the projected radial
offsets $R_s$ ($h^{-1}$ Mpc) between halos and clusters which are not correctly centered. Diamonds with error bars show the measurement in
the mocks, while the solid line represents the best 2D Gaussian model, corresponding to a width $\sigma_s = 0.42\,h^{-1}$ Mpc. The dashed lines are
the two models that bound the much more generous 68% confidence region of $\sigma_s$ with the adopted 30% prior on $\sigma_s$.



expression converts the physical distance $r$ into comoving
units. All distances in this paper are in physical, not
comoving, units. The linear growth factor satisfies

$$ D(a) \propto H(a) \int_0^a da'\,[H(a')\,a']^{-3} \tag{13} $$

with $a = 1/(1+z)$; $D$ is normalized to unity at $a = 1$ ($z = 0$).
We can therefore express the two-halo contribution
to the density as

$$ \rho_{2h}(r) = B\,\rho_{c,0}\,(1+z)^3\,\xi_l((1+z)\,r), \tag{14} $$

where we have defined an effective bias parameter

$$ B \equiv b(M_{200}, z)\,\Omega_m\,\sigma_8^2\,D(z)^2. \tag{15} $$
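The growth factor of Eqn. 13 and the effective bias of Eqn. 15 can be evaluated with a few lines of quadrature. The sketch below assumes a flat LCDM expansion history with radiation neglected; the function names, the midpoint rule, and the number of integration steps are illustrative, and any parameter values passed in are the user's assumptions rather than values from the paper.

```python
import numpy as np

def hubble_ratio(a, omega_m):
    """H(a)/H0 for a flat LCDM model (radiation neglected)."""
    return np.sqrt(omega_m / a**3 + (1.0 - omega_m))

def growth_unnormalized(a, omega_m, n=20000):
    """H(a) * integral_0^a da' [H(a') a']^-3 by the midpoint rule (Eqn. 13)."""
    edges = np.linspace(0.0, a, n + 1)
    mid = 0.5 * (edges[1:] + edges[:-1])     # midpoints avoid a' = 0
    integrand = (hubble_ratio(mid, omega_m) * mid) ** -3
    return hubble_ratio(a, omega_m) * np.sum(integrand) * (a / n)

def growth(z, omega_m):
    """Linear growth factor D(z), normalized to D = 1 at z = 0."""
    a = 1.0 / (1.0 + z)
    return growth_unnormalized(a, omega_m) / growth_unnormalized(1.0, omega_m)

def effective_bias(b, z, omega_m, sigma_8):
    """Effective bias B = b * Omega_m * sigma_8^2 * D(z)^2 (Eqn. 15)."""
    return b * omega_m * sigma_8**2 * growth(z, omega_m) ** 2
```

A useful check is the Einstein-de Sitter limit ($\Omega_m = 1$), where the integral gives $D = a$ exactly; for $\Omega_m < 1$ the normalized growth factor at $z > 0$ lies above $1/(1+z)$.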

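The contribution of the two-halo term to the lensing
signal, for fixed values of $n_s$, $h$, and $\Omega_m$, can be written
as $\Delta\Sigma(R; B) = B\,\Delta\Sigma_l(R)$, where, as before,
$\Delta\Sigma_l(R) = \bar{\Sigma}_l(<R) - \Sigma_l(R)$, and

$$ \Sigma_l(R) = (1+z)^3\,\rho_{c,0} \int dy\; \xi_l\!\left((1+z)\sqrt{y^2 + R^2}\right) = (1+z)^2\,\rho_{c,0}\,W((1+z)R) \tag{16} $$

with

$$ W(R) = \int dy\; \xi_l\!\left(\sqrt{y^2 + R^2}\right). \tag{17} $$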



4.5. Summary of halo model and parameter priors for $\Delta\Sigma$ fits

Combining the results from §4.1 through §4.4,
we can write down the model for the lensing signal $\Delta\Sigma$
thus far,

$$ \Delta\Sigma(R) = \frac{M_0}{\pi R^2} + p_c\,\Delta\Sigma_{NFW}(R) + (1 - p_c)\,\Delta\Sigma^s_{NFW}(R) + B\,\Delta\Sigma_l(R) \tag{18} $$

where, sequentially, the terms come from the BCGs, the
halos centered on the BCGs, the halos not centered on
the BCGs, and the neighboring halos.
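The four-component sum described above is simple to assemble once the individual profiles have been tabulated. The sketch below assumes the centered NFW, miscentered NFW, and linear two-halo profiles are already available as arrays on a common radial grid and in consistent units; the function and argument names are illustrative.

```python
import numpy as np

def delta_sigma_model(R, M0, p_c, B, ds_nfw, ds_nfw_s, ds_lin):
    """Sum of the four lensing terms (point mass, centered NFW,
    miscentered NFW, two-halo).

    R        : projected radii (array)
    M0       : BCG point mass
    p_c      : fraction of correctly centered BCGs
    B        : effective bias amplitude
    ds_nfw, ds_nfw_s, ds_lin : DeltaSigma profiles evaluated at R
    """
    point_mass = M0 / (np.pi * R**2)
    return point_mass + p_c * ds_nfw + (1.0 - p_c) * ds_nfw_s + B * ds_lin
```

Keeping the components as separate inputs makes it easy to plot each contribution individually, as is done for the fitted profiles later in the paper.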

There are two further effects to consider. This model
assumes a constant halo mass where, in reality, the signal
will be averaged over the distribution of halo masses
for each richness bin. The other effect that we will consider
is the non-linear shear effect that is discussed in
Mandelbaum et al. (2006). We will treat this non-linear
contribution first and then integrate the full signal over
the distribution of masses.

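The average tangential ellipticities do not trace the
shear exactly but rather trace the reduced shear,
$g = \gamma/(1 - \kappa)$. Let $e_{ij}$ be the ellipticity of the $i$-th source
galaxy around cluster $j$ for some radial bin. As shown in
Mandelbaum et al. (2006), an estimator for $\Delta\Sigma$ formed from
a weighted average of ellipticities and identical halos has
a second-order contribution

$$ \widehat{\Delta\Sigma} = \sum_{ij} W_{ij}\, e_{ij} \tag{19} $$
$$ \phantom{\widehat{\Delta\Sigma}} = \Delta\Sigma + \Delta\Sigma\,\Sigma\,\mathcal{L}_z. \tag{20} $$

The explicit expressions for $\mathcal{L}_z$ and for the weights $W_{ij}$
(Eqn. 21) are not legible in this copy. The quantity $\mathcal{L}_z$
differs from the corresponding ratio of $\Sigma_{crit}$ moments in
Mandelbaum et al. (2006) in that our weighting
has an explicit factor of $\Sigma_{crit}^{-1}$. In the weights $W_{ij}$,
$\mathcal{R}$ is the shear responsivity and $\sigma_{ij}^2$ are the estimates
of variances on source ellipticities.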
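The origin of the second-order term can be sketched for a single lens-source pair by expanding the reduced shear to first order in $\kappa$. This uses the standard weak-lensing relations $\gamma_t = \Delta\Sigma/\Sigma_{crit}$ and $\kappa = \Sigma/\Sigma_{crit}$, which are assumed here rather than quoted from the text:

```latex
g = \frac{\gamma_t}{1 - \kappa} \simeq \gamma_t\,(1 + \kappa)
  = \frac{\Delta\Sigma}{\Sigma_{crit}} + \frac{\Delta\Sigma\,\Sigma}{\Sigma_{crit}^{2}}
\quad\Longrightarrow\quad
g\,\Sigma_{crit} \simeq \Delta\Sigma + \Delta\Sigma\,\Sigma\,\Sigma_{crit}^{-1} .
```

Averaging over sources with the estimator weights then replaces the single-pair factor $\Sigma_{crit}^{-1}$ by a weighted mean, which is presumably the role played by the redshift-dependent factor multiplying $\Delta\Sigma\,\Sigma$ in the estimator.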

Using the photometric redshifts for the source galaxies
and the maxBCG photometric estimates for the cluster
redshifts, we find $\mathcal{L}_z = 1.40 \times 10^{-4}\,h^{-1}\,\mathrm{pc}^2/M_\odot$. This
quantity varies only a few percent across different cluster
samples and different radial bins, a variation we ignore.

For the last step we need to consider that within any
richness bin, there will be scatter in mass. So we need to
integrate this expression over the probability distribution
of halo masses $P(M_{200})$. Here, $\langle\,\rangle$ indicates averaging
over $P(M_{200})$.

$$ \langle\widehat{\Delta\Sigma}\rangle = \langle\Delta\Sigma\rangle + \langle\Delta\Sigma\,\Sigma\rangle\,\mathcal{L}_z \tag{22} $$



We will use a log-normal distribution of $M_{200}$ at fixed
richness with a variance in $\ln M_{200}$ given by $V_M$, which is
our last model parameter.

For the first term $\langle\Delta\Sigma\rangle$ we can integrate Eqn. 18
over $P(M_{200})$. Corresponding to Eqn. 18 there is a
three-term expression for $\Sigma$ (the point mass doesn't contribute).
So our second-order correction has 12 terms
that need to be integrated over $P(M_{200})$. Most of these
pairs $(i, j)$ do not contribute since
$\langle\Delta\Sigma_i\,\Sigma_j\rangle(R)\,\mathcal{L}_z \ll \langle\Delta\Sigma\rangle(R)$ at all scales.

Only two of these terms make meaningful contributions
at the smallest scales,

$$ \Delta\Sigma_{NL} = \mathcal{L}_z \times \left[ \frac{p_c M_0}{\pi R^2}\,\langle\Sigma_{NFW}\rangle + \ldots \right] \tag{23} $$

(the second term is not legible in this copy).



With this last expression we can write down our full
model for our data where, again, $\langle\,\rangle$ indicates averaging
over $P(M_{200})$,

$$ \langle\widehat{\Delta\Sigma}(R)\rangle = \frac{M_0}{\pi R^2} + p_c\,\langle\Delta\Sigma_{NFW}\rangle(R) + (1 - p_c)\,\langle\Delta\Sigma^s_{NFW}\rangle(R) + B\,\Delta\Sigma_l(R) + \Delta\Sigma_{NL}(R). \tag{24} $$

This model has seven parameters: the BCG point mass
$M_0$; the two NFW halo parameters $r_{200}$ and $c_{200}$; the
scatter in the mass-richness relation $V_M$; the halo miscentering
width $\sigma_s$ and the halo centering fraction $p_c$; and the
linear bias amplitude $B = b(M_{200})\,\Omega_m\,\sigma_8^2\,D^2(z)$, which
should also be thought of as the average over the mass
distribution.

Since we will integrate over $P(M_{200})$ we need to be specific
about what we mean by our parameter $r_{200}$. We take
this to be the $r_{200}$ corresponding to the average $M_{200}$.
For a log-normal distribution $\langle M_{200}\rangle = \exp(V_M/2 + \mu)$,
where $\mu$ is the average $\ln(M_{200})$.

Becker et al. (2007) measure the variance of the logarithm
of the galaxy velocity dispersion as
$\mathrm{Var}(\ln\sigma_v) = 0.0963 - 0.0241\,\ln(N_{200}/25)$, and Evrard et al. (2007)
determine the scaling $M_{200} \propto \sigma_v^A$ with $A = 2.98$. Since
$\ln M_{200} = A \ln\sigma_v + \mathrm{const}$, the variance scales by $A^2$, which
results in

$$ V_M = 0.855 - 0.214\,\ln(N_{200}/25). \tag{25} $$
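The step from the velocity-dispersion scatter to the mass scatter is a one-line propagation: a power law $M \propto \sigma_v^A$ multiplies the variance of the logarithm by $A^2$. The sketch below reproduces the coefficients of Eqn. 25 from the quoted inputs, assuming the richness dependence of the velocity-dispersion variance is logarithmic as in Eqn. 25; the function names are illustrative.

```python
import math

A = 2.98  # slope of the M200 - sigma_v scaling quoted in the text

def var_ln_sigma_v(n200):
    """Variance of ln(sigma_v) at fixed richness, as quoted in the text."""
    return 0.0963 - 0.0241 * math.log(n200 / 25.0)

def var_ln_m200(n200):
    """V_M: the log-variance scales by A^2 under M ~ sigma_v^A (Eqn. 25)."""
    return A**2 * var_ln_sigma_v(n200)

def mean_mass(mu, v_m):
    """Mean of a log-normal with mean mu and variance v_m in ln(M)."""
    return math.exp(mu + 0.5 * v_m)
```

With $A^2 \approx 8.88$, the constant term becomes $8.88 \times 0.0963 \approx 0.855$ and the slope $8.88 \times 0.0241 \approx 0.214$, matching Eqn. 25. The `mean_mass` helper illustrates why the mean mass exceeds $\exp(\mu)$ when the scatter is large.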



We allow for an uncertainty of 0.60 in $V_M$ in our prior
(i.e. a 30% uncertainty for the scatter). This log-normal
model also seems consistent with our mock catalogs.

Table 5 lists all seven model parameters used in the
fits, including the information about the prior distributions.
To enforce positivity, logarithms are used for all
parameters except $B$ and $M_0$. Each prior distribution is
taken to be a Gaussian with mean and standard deviation
as indicated in the table. In addition $M_0$ is forced to
be positive since it is not at all constrained on the lower
end. Here, a "weak prior" means that neither the best-fit
parameters nor the estimated parameter errors change
significantly if the standard deviation of the prior distribution
is increased. Since the parameter $p_c$ is constrained
to lie in the range $[0, 1]$, we use the transformed parameter
$q = \ln[p_c/(1 - p_c)]$, which has range $(-\infty, +\infty)$ and
can thus be assigned a Gaussian prior. The prior means
of $q$ and $V_M$ vary with richness as described in Eqns. 10
and 25.

To fit the measured $\Delta\Sigma$ profiles with the model, we use
a Markov Chain Monte Carlo (MCMC). MCMC is useful 
for efficiently calculating likelihoods in multi-dimensional 
parameter spaces. MCMC methods generate "chains" or 
sequences in the parameter space that represent a fair 
sampling from the full posterior probability distribution. 
Thus, they allow one to visualize the likelihood surface 
and see degeneracies between parameters without assum- 
ing that the errors are normally distributed (as in the 
Fisher matrix method). It is also straightforward to in- 
clude priors on parameters in the MCMC approach. Our 
MCMC routine uses the Metropolis-Hastings algorithm 
with Gaussian transition functions. The total number of 
steps is 100,000, and we discard a burn-in period of the 
first 1000 steps. Runs of varying length show that con- 
vergence of the posterior distribution is reached before 
10,000 steps and longer runs such as our 100,000 step 
run improve the sampling but do not affect the sample 
mean or variances by meaningful amounts. 
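The sampling scheme described above can be illustrated with a minimal random-walk Metropolis sampler. This is a generic sketch, not the authors' code: the target below is a hypothetical one-parameter toy posterior (a Gaussian likelihood times a broad Gaussian prior), and the function names, step size, chain length, and seed are illustrative assumptions.

```python
import numpy as np

def metropolis_hastings(log_post, x0, step, n_steps=100_000, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler with Gaussian proposals.

    log_post : function returning log-likelihood + log-prior at a point
    x0       : starting point, shape (n_params,)
    step     : proposal standard deviation per parameter
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal(x.size)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio); proposal is symmetric.
        if np.log(rng.uniform()) < lp_new - lp:
            x, lp = proposal, lp_new
        chain[i] = x
    return chain[burn_in:]          # discard the burn-in steps

# Hypothetical toy posterior: Gaussian likelihood plus a broad Gaussian prior.
def toy_log_post(x):
    log_like = -0.5 * ((x[0] - 1.0) / 0.5) ** 2
    log_prior = -0.5 * ((x[0] - 0.0) / 10.0) ** 2
    return log_like + log_prior

samples = metropolis_hastings(toy_log_post, x0=[0.0],
                              step=np.array([0.5]), n_steps=20_000)
print(samples.mean(axis=0), samples.std(axis=0))
```

Gaussian priors enter simply as additive quadratic terms in the log posterior, which is why including them is straightforward; marginal posteriors for any parameter follow from histograms of the corresponding chain column.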

4.6. Systematic errors 

There are two major sources of systematic error in any
weak lensing measurement: shear calibration error and
errors associated with photo-z biases. The shear calibration
error of the shear estimation methods that we use
for the SDSS was tested as part of the Shear TEsting
Program II (STEP2; Massey et al. 2007) and found to
be less than a percent (for the RM method in STEP2).
However, we allow for a 3% error in shear calibration
since the STEP simulation error may not represent the
full calibration error when these methods are applied to
real data; e.g., the PSF modeling of SDSS was not
tested in STEP.

The dominant systematic error is that associated with
biases in the photometric redshift distribution. We use a
neural network based method (Cunha et al., in preparation)
which uses a training set of spectroscopic redshifts
from the SDSS, CNOC2 (Yee et al. 2000) and CFRS
(Lilly et al. 1995). See Paper I for details.

Although it is difficult to estimate the residual photo-z
bias, we will assume that the amplitude of the resultant
masses is uncertain at the level of 7% and we will include
this in our errors of the zero-point of the mass-richness
relation. Further improvements in photo-z calibration
should be able to reduce this overall error by as much as
a few percent.

Another source of systematic errors is model dependency.
The priors that we have chosen are considered to
be independent between richness bins and so combining
12 to 16 bins of data reduces the effective width of these
priors by a factor of about 4 when considering averaged quantities
such as the mass-richness relation (see §5.1). However, if
the prior means for quantities such as $q$ and $V_M$ are
shifted systematically from their true values, the effect of





TABLE 5
Halo Model Parameters for $\Delta\Sigma$ fits

Param #   Parameter          Description                              Prior-mean   Prior-sigma   Note
1         $\ln(r_{200})$     $r_{200}$ radius                         -0.693       1.5           weak prior
2         $\ln(c_{200})$     concentration                            1.386        3             weak prior
3         $B$                bias amplitude                           0.5          4.0           weak prior
4         $q$                miscentering parameter                   see text     0.4           strong prior
5         $\ln(\sigma_s)$    miscentering width                       -0.868       0.3           strong prior
6         $M_0$              point mass ($10^{12}\,h^{-1} M_\odot$)   0            2.5           weak prior
7         $V_M$              variance of $\ln(M_{200})$               see text     0.6           strong prior

Note. — Parameters in the model (Eqn. 24) for $\Delta\Sigma$. The mean and standard deviation for
the Gaussian prior distribution are given as well as a brief description.



these marginalizations may not fully account for this. By
experimenting with different values for these prior means
we can estimate the possible level of additional systematic
errors. For the mass-richness relation we estimate
that this will contribute an additional systematic error of
10%. The concentration, $c_{200}$, is more affected by shifts
in these nuisance parameters, particularly $q$ and $M_0$. We
allow for a 30% systematic error on the amplitude of the
$c_{200}$-$M_{200}$ relation (§5.2). The bias parameter, $B$, is less
affected by these nuisance parameters but, as we shall see
in §5.3, some knowledge of $V_M$ is required to compare it
with theoretical predictions.

5. RESULTS OF HALO MODEL FITS 

Figure 6 shows the result of an MCMC run for the seventh
$N_{200}$ richness bin. These are the one-dimensional
marginal posteriors for the 7 parameters. Most resemble
a normal distribution with the exception of $M_0$, which is
constrained to be strictly positive. The red lines indicate
the prior normal distribution for each. The first three,
$\ln r_{200}$, $\ln c_{200}$ and $B$, have uninformative priors whereas
$q$, $\ln\sigma_s$ and $\ln V_M$ have constraining priors. That is, in
the latter case, the posterior resembles the prior; the data
are uninformative for these three.

Figure 7 shows the marginal posterior distributions for
all 21 pairs of parameters for the same bin. The red region
is the 68% ($1\sigma$) confidence region; green is the 95%
($2\sigma$) confidence region and blue is the 99% ($3\sigma$) confidence
region. Although none of these parameters appear
to be strongly degenerate at these noise levels, there is
some correlation. $r_{200}$ is correlated with both $q$ and $V_M$
and $c_{200}$ is correlated with $q$ and $M_0$. These contours
also allow for an estimate of how the best fit parameters
might be biased if we have systematically misestimated
our nuisance parameter priors. If the shot noise were
significantly smaller, these correlations with nuisance parameters
would become more dominant sources of error,
and so modeling the effect of these (and possibly other)
parameters will become a more critical issue for future
experiments.

The results of fitting this model to the $\Delta\Sigma$ profiles
in the 12 $N_{200}$ richness and 16 $L_{200}$ luminosity bins are
shown in Figures 8 and 9. In each panel, the green curve
shows the NFW halo profile, the blue curve indicates the
two-halo term, the red curve is the BCG point mass term,
the orange curve is the smoothed (miscentered) NFW
halo component, and the purple dashed curve shows the
non-linear correction. The magenta curve shows the sum
of these terms. One can see that the model does a good
job of fitting all of the features in the shear profiles, the
most prominent of which is the one-halo to two-halo transition,
which usually occurs near $r_{200}$. The best fit parameters
for $r_{200}$, $c_{200}$ and $B$, properly marginalized over
the nuisance parameters, are shown in Tables 6 and 7.
We show the values for mass and concentration converted
to other mass definitions in Tables 8 and 9. The method
of conversion is discussed in the Appendix.

Figure 10 shows the best-fit models over-plotted on
the inverted 3D mass profiles that were previously shown
in Figure 3. Because the mass profiles are less noisy,
they more clearly display the features in the data. The
one-halo to two-halo transition is most prominent in the
lowest richness and luminosity bins.

5.1. The mass-richness relation 

Figure 11 shows the inferred central halo mass-richness
relations for both $N_{200}$ and $L_{200}$ richness measures. The
red line in each case shows the resulting power-law fit to
the relation. The fit to the mass-richness relation is

$$ M_{200}(N_{200}) = M_{200|20}\,(N_{200}/20)^{\alpha_N} \tag{26} $$

with

$$ M_{200|20} = (8.8 \pm 0.4_{stat} \pm 1.1_{sys}) \times 10^{13}\,h^{-1} M_\odot $$
$$ \alpha_N = 1.28 \pm 0.04. $$

The mass-luminosity relation is found to be

$$ M_{200}(L_{200}) = M_{200|40}\,(L_{200}/40)^{\alpha_L} \tag{27} $$

with

$$ M_{200|40} = (9.5 \pm 0.4_{stat} \pm 1.2_{sys}) \times 10^{13}\,h^{-1} M_\odot $$
$$ \alpha_L = 1.22 \pm 0.04. $$
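For quick estimates, the two power-law fits can be wrapped as small helper functions. The sketch below uses only the central values quoted above and ignores the statistical and systematic uncertainties; it takes the normalization of both relations to be in units of $10^{13}\,h^{-1} M_\odot$, and the function names are illustrative.

```python
def m200_from_n200(n200, m_pivot=8.8e13, alpha=1.28):
    """Mean M200 (h^-1 Msun) at richness N200, central values of Eqn. 26."""
    return m_pivot * (n200 / 20.0) ** alpha

def m200_from_l200(l200, m_pivot=9.5e13, alpha=1.22):
    """Mean M200 (h^-1 Msun) at luminosity L200 (same units as the pivot
    value of 40), central values of Eqn. 27."""
    return m_pivot * (l200 / 40.0) ** alpha

for n200 in (5, 20, 80):
    print(n200, f"{m200_from_n200(n200):.3e}")
```

Since both slopes exceed unity, the mean mass per unit richness or luminosity increases with cluster size.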

The statistical error on the zero-point of both mass-richness
relations is about 5%. This includes the full
marginalization over the other six model parameters. As
discussed in §4.6, we need to include systematic errors
due to shear calibration and possible photo-z biases as
well as any remaining systematic biases in our modeling.
We allow for a 3% shear calibration bias, a 7% photo-z
bias and 10% for modeling biases, so this increases the
error on the zero-point of the mass-richness relations to
about 13%.
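The quoted total is consistent with adding the four contributions in quadrature. The text does not state the combination rule explicitly, so independence of the error sources is an assumption here:

```latex
\sqrt{(5\%)^2 + (3\%)^2 + (7\%)^2 + (10\%)^2} = \sqrt{183}\,\% \approx 13.5\% .
```

The modeling and photo-z terms dominate the sum; the statistical term alone contributes only 25 of the 183 units under the square root.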

To accommodate other conventions used in the literature,
power-law fits to the mass and concentration data
for alternate mass-scale definitions (see Tables 8 and
9) are shown in Tables 10 and 11.

While this seven-parameter model may appear overly 
complicated, it is necessary in order to properly account 
for the full uncertainty in modeling the cluster shear pro- 
files. For example, if we were to ignore miscentering 
Fig. 6. — This shows the one-dimensional marginal posteriors for the 7 parameters for the seventh $N_{200}$ richness bin. Most resemble a
normal distribution. The red lines indicate the prior normal distribution for each (with arbitrary normalization). The first three, $\ln r_{200}$,
$\ln c_{200}$ and $B$, have uninformative priors whereas $q$, $\ln\sigma_s$ and $\ln V_M$ have constraining priors; i.e. the posterior resembles the prior so the
data are uninformative for these. The prior for $M_0$ is constrained to be positive but is largely uninformative beyond that.



TABLE 6
Best Fit Parameters: $N_{200}$ Bins

$\langle N_{200}\rangle$   $M_{200}$ ($10^{12}\,h^{-1} M_\odot$)   $r_{200}$ ($h^{-1}$ Mpc)   $c_{200}$        $B$
3.00      6.37 ± 1.04        0.28 ± 0.015    5.78 ± 1.35    0.07 ± 0.03
4.00      9.77 ± 1.80        0.32 ± 0.020    6.17 ± 2.29    0.11 ± 0.04
5.00      14.63 ± 2.90       0.37 ± 0.024    4.45 ± 1.58    0.17 ± 0.05
6.00      21.35 ± 3.66       0.42 ± 0.024    4.33 ± 1.12    0.13 ± 0.06
7.00      23.31 ± 5.56       0.43 ± 0.034    5.77 ± 2.35    0.18 ± 0.07
8.00      27.86 ± 6.97       0.46 ± 0.038    2.34 ± 1.01    0.25 ± 0.09
9.82      44.14 ± 7.96       0.53 ± 0.032    3.97 ± 1.21    0.19 ± 0.07
13.91     60.01 ± 8.45       0.59 ± 0.028    4.22 ± 1.12    0.23 ± 0.08
20.78     95.96 ± 12.58      0.69 ± 0.030    5.82 ± 1.19    0.25 ± 0.10
31.09     167.76 ± 23.39     0.83 ± 0.039    2.95 ± 0.66    0.21 ± 0.13
50.27     252.06 ± 35.28     0.95 ± 0.044    4.01 ± 0.86    0.46 ± 0.20
92.18     568.81 ± 87.75     1.25 ± 0.064    2.92 ± 0.76    0.48 ± 0.36

Note. — This shows the best fit parameters of interest from the MCMC for the
$N_{200}$ richness bins. We have marginalized over the four nuisance parameters.
Fig. 7. — The results of the MCMC chain for the seventh $N_{200}$ richness bin. This shows the marginal posterior distributions for all 21
pairs of parameters. The red region is the 68% ($1\sigma$) confidence region; green is the 95% ($2\sigma$) confidence region and blue is the 99%
($3\sigma$) confidence region. Although none of these parameters appear to be strongly degenerate at these noise levels, there is some correlation.
$r_{200}$ is correlated with both $q$ and $V_M$ and $c_{200}$ is correlated with $q$ and $M_0$. These contours also allow for an estimate of how the best fit
parameters might be biased if we have systematically misestimated our nuisance parameter priors.
TABLE 7
Best Fit Parameters: $L_{200}$ Bins

$\langle L_{200}\rangle$   $M_{200}$ ($10^{12}\,h^{-1} M_\odot$)   $r_{200}$ ($h^{-1}$ Mpc)   $c_{200}$        $B$
5.59      7.87 ± 1.84        0.30 ± 0.023    5.31 ± 2.39    0.14 ± 0.04
6.97      9.19 ± 1.91        0.32 ± 0.022    5.21 ± 1.79    0.13 ± 0.04
8.69      13.62 ± 2.45       0.36 ± 0.022    6.86 ± 1.88    0.08 ± 0.04
10.84     18.23 ± 3.22       0.40 ± 0.023    4.20 ± 1.20    0.14 ± 0.05
13.53     29.65 ± 5.99       0.47 ± 0.031    4.77 ± 1.73    0.16 ± 0.06
16.89     37.44 ± 6.36       0.50 ± 0.029    5.20 ± 1.70    0.24 ± 0.07
21.06     41.79 ± 7.27       0.52 ± 0.030    3.88 ± 1.19    0.22 ± 0.07
26.31     59.58 ± 9.34       0.59 ± 0.031    4.99 ± 1.47    0.28 ± 0.09
32.89     78.32 ± 11.88      0.64 ± 0.033    6.01 ± 1.62    0.24 ± 0.10
40.95     97.25 ± 14.51      0.69 ± 0.034    5.41 ± 1.47    0.28 ± 0.11
51.19     141.43 ± 23.27     0.79 ± 0.043    4.16 ± 1.22    0.21 ± 0.12
64.08     204.05 ± 33.23     0.89 ± 0.048    2.67 ± 0.75    0.30 ± 0.18
79.89     210.75 ± 35.03     0.90 ± 0.050    4.09 ± 1.13    0.16 ± 0.12
98.69     235.24 ± 47.69     0.93 ± 0.063    4.11 ± 1.41    0.48 ± 0.27
124.59    327.90 ± 62.23     1.04 ± 0.066    3.75 ± 1.13    0.17 ± 0.31
184.65    610.42 ± 99.89     1.28 ± 0.070    3.45 ± 0.90    0.39 ± 0.31

Note. — This shows the best fit parameters of interest from the MCMC for the
$L_{200}$ richness bins. We have marginalized over the four nuisance parameters.



TABLE 8
Mass Richness: $N_{200}$ Bins

$\langle N_{200}\rangle$   $M_{200}$   $c_{200}$   $M_{180b}$   $c_{180b}$   $M_{vir}$   $c_{vir}$   $M_{500}$   $c_{500}$
3.00      6.37      5.78    8.27      8.72    7.41      7.26    4.72      3.85
4.00      9.77      6.17    12.58     9.28    11.30     7.74    7.31      4.12
5.00      14.63     4.45    19.66     6.80    17.35     5.64    10.34     2.92
6.00      21.35     4.33    28.82     6.63    25.39     5.49    15.01     2.84
7.00      23.31     5.77    30.25     8.71    27.08     7.25    17.24     3.84
8.00      27.86     2.34    42.04     3.71    35.40     3.03    16.89     1.46
9.82      44.14     3.97    60.39     6.09    52.91     5.04    30.49     2.58
13.91     60.01     4.22    81.34     6.45    71.54     5.35    41.97     2.76
20.78     95.96     5.82    124.42    8.78    111.44    7.31    71.09     3.88
31.09     167.76    2.95    241.65    4.60    207.35    3.78    108.26    1.88
50.27     252.06    4.01    344.24    6.16    301.85    5.10    174.52    2.62
92.18     568.81    2.92    820.81    4.56    703.78    3.75    366.17    1.86

Note. — Maximum likelihood mean halo mass and concentration parameters
for each richness bin converted from our $200\rho_c$ definition of virial mass
into three other common definitions. The unit of mass is $10^{12}\,h^{-1} M_\odot$.



TABLE 9
Mass Richness: $L_{200}$ Bins

$\langle L_{200}\rangle$   $M_{200}$   $c_{200}$   $M_{180b}$   $c_{180b}$   $M_{vir}$   $c_{vir}$   $M_{500}$   $c_{500}$
5.59      7.87      5.31    10.32     8.04    9.20      6.69    5.74      3.52
6.97      9.19      5.21    12.09     7.89    10.76     6.56    6.68      3.45
8.69      13.62     6.86    17.32     10.29   15.63     8.59    10.34     4.61
10.84     18.23     4.20    24.72     6.43    21.74     5.32    12.74     2.74
13.53     29.65     4.77    39.47     7.26    34.98     6.03    21.23     3.14
16.89     37.44     5.20    49.25     7.88    43.85     6.55    27.22     3.44
21.06     41.79     3.88    57.37     5.97    50.20     4.93    28.73     2.52
26.31     59.58     4.99    78.81     7.58    70.01     6.30    43.00     3.30
32.89     78.32     6.01    101.14    9.06    90.74     7.55    58.32     4.01
40.95     97.25     5.41    127.27    8.18    113.56    6.81    71.18     3.59
51.19     141.43    4.16    192.06    6.38    168.80    5.28    98.64     2.72
64.08     204.05    2.67    299.66    4.19    255.11    3.44    128.33    1.68
79.89     210.75    4.09    286.93    6.28    251.92    5.19    146.50    2.67
98.69     235.24    4.11    320.05    6.30    281.08    5.22    163.68    2.68
124.59    327.90    3.75    452.69    5.78    395.19    4.77    223.80    2.43
184.65    610.42    3.45    854.38    5.35    741.71    4.41    409.17    2.23

Note. — Maximum likelihood mean halo mass and concentration parameters
for each luminosity bin converted from our $200\rho_c$ definition of virial mass
into three other common definitions. The unit of mass is $10^{12}\,h^{-1} M_\odot$.
Fig. 8. — Model fits to $\Delta\Sigma(R)$ for the 12 $N_{200}$ richness bins. The model components are the NFW halo profile (green), miscentered halo
component (orange), the central BCG (red), neighboring halos (blue), and the non-linear contribution (purple dashed). The magenta curves
show the sum of these components for the best-fit models in each bin.

Fig. 9. — Model fits to $\Delta\Sigma(R)$ for the 16 $L_{200}$ luminosity bins. The model components are the NFW halo profile (green), miscentered
halo component (orange), the central BCG (red), neighboring halos (blue), and the non-linear contribution (purple dashed). The magenta
curves show the sum of these components for the best-fit models in each bin.



and shear non-linearity and include only the three parameters
$c_{200}$, $M_{200}$, and $B$ in the model fits, then the
statistical uncertainty in the calibration of the cluster
mass-richness relation would be only 3% instead of 5%.
However, the halo mass estimates would be biased low
by a factor of $\sim 1.4$. This factor arises because $M_{200}$
is determined mostly by the amplitude of $\Delta\Sigma$ on scales
$R < 1\,h^{-1}$ Mpc, where the smoothed $\Delta\Sigma^{s}_{\rm NFW}(R)$ makes
very little contribution; as a result, ignoring miscentering
in fitting to the shear on small scales leads to an
underestimate of the mass by a factor of $\sim p_c$. From the mock
catalogs, we find $\langle p_c \rangle \sim 0.7$, or $1/\langle p_c \rangle \sim 1.4$. Therefore,
halo miscentering has a large systematic effect on the
estimated halo masses and concentrations and so must be
included.

5.2. Halo concentration scaling relations 

Figure 12 shows the scaling of the mean concentration
$c_{200}$ with halo mass. We have combined the results
from both richness (red points) and luminosity bins
Fig. 10. The model fits of Figs. 8 and 9 over-plotted on the inverted 3D mass profiles for the 12 $N_{200}$ richness (left panel) and 16 $L_{200}$
luminosity bins (right panel).

Fig. 11. The inferred mean halo mass vs. richness (left panel) and mass vs. luminosity (right panel) relations from the model fits to
the lensing profiles. The red lines show the best-fit power-law relations (see text).



(black points) on the same plot; these are not independent,
since the same clusters are used for both. The blue
curve shows the best-fit power law,



$$ c_{200}(M_{200}) = c_{200|14}\,\left(M_{200}/10^{14}\,h^{-1}M_\odot\right)^{\beta_c} \tag{28} $$
$$ c_{200|14} = 4.1 \pm 0.2_{\rm stat} \pm 1.2_{\rm sys} $$
$$ \beta_c = -0.12 \pm 0.04. $$



The fit is performed with all data points from both binnings,
but the errors are adjusted upward by $\sqrt{2}$ so that
they are not treated as independent data points. These
results indicate that the halo concentrations, with typical
values $c_{200} \sim 5$, depend only weakly on halo mass, as has
been suggested by previous observational and theoretical
results. Note that ignoring the parameters $p_c$, $\sigma_s$, and $M_0$
in the model fits would lead to a (biased) underestimate
of the halo concentration parameter $c_{200}$ by about a
factor of 3, as well as to unrealistically small error estimates
on the concentration.
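
As a quick illustration of how weak the mass dependence in Eqn. 28 is, the sketch below evaluates the best-fit power law at a few masses. The function name and the sample masses are hypothetical choices made for this example; only the central values 4.1 and -0.12 come from the fit quoted above, and the quoted uncertainties are ignored.

```python
def c200_lensing(m200, c14=4.1, beta_c=-0.12):
    """Best-fit concentration-mass power law of Eqn. 28.

    m200 is the halo mass in units of h^-1 Msun; c14 and beta_c are the
    central values quoted in the text (uncertainties are not propagated).
    """
    return c14 * (m200 / 1.0e14) ** beta_c


if __name__ == "__main__":
    # Illustrative masses only; they are not the binned measurements.
    for m200 in (1.0e13, 1.0e14, 1.0e15):
        print(f"M200 = {m200:.1e} h^-1 Msun -> c200 = {c200_lensing(m200):.2f}")
```

Over two decades in mass the concentration changes by less than a factor of two, which is the sense in which the dependence is weak.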

For comparison with the lensing results, the green
curve in Fig. 12 shows the predicted concentration
vs. mass relation from the halo formation model of
Bullock et al. (2001). Note that Bullock et al. (2001) use
a different definition of halo mass $M_{\rm vir}$ and concentration
$c_{\rm vir}$, so we have converted their predictions to our
parameters $M_{200}$ and $c_{200}$ following the translation given in the
Appendix. In their model, the halo concentration is given
by $c_{\rm vir} = K\,(a/a_c)$, where $a = 1/(1+z)$ and $a_c$ is the
collapse epoch of the halo, i.e., the time at which the typical
collapsed mass, $M_*$, is a fixed fraction $F$ of the halo mass,
$M_*(a_c) = F\,M_{\rm vir}$. This model is defined by the two
parameters $K$ and $F$, which are assumed to be independent
TABLE 10

Mass Richness Power-law Fits: $N_{200}$ Bins

Mass type      $M_{200|20}$   $\alpha_N$   $c_{200|20}$   $\beta_N$
$M_{200}$      8.794E+13      1.28         3.99           -0.15
$M_{180b}$     1.204E+14      1.30         6.14           -0.14
$M_{\rm vir}$  1.055E+14      1.29         5.08           -0.15
$M_{500}$      6.069E+13      1.25         2.60           -0.16

Note. Coefficients and exponents of the power-law fits of mass and
concentration versus richness for the different virial mass definitions.
The mass-richness relation and concentration-richness relation are of the
form $M = M_{200|20}\,(N_{200}/20)^{\alpha_N}$ and $c = c_{200|20}\,(N_{200}/20)^{\beta_N}$.
The relative errors on parameters are the same as the $M_{200}$ versions (see
text).


TABLE 11

Mass Richness Power-law Fits: $L_{200}$ Bins

Mass type      $M_{200|40}$   $\alpha_L$   $c_{200|40}$   $\beta_L$
$M_{200}$      9.504E+13      1.23         4.37           -0.15
$M_{180b}$     1.284E+14      1.25         6.68           -0.14
$M_{\rm vir}$  1.131E+14      1.24         5.54           -0.14
$M_{500}$      6.672E+13      1.20         2.86           -0.16

Note. Coefficients and exponents of the power-law fits of mass and
concentration versus luminosity for the different virial mass definitions.
The mass-luminosity relation and concentration-luminosity relation are of the
form $M = M_{200|40}\,(L_{200}/40)^{\alpha_L}$ and $c = c_{200|40}\,(L_{200}/40)^{\beta_L}$.
The relative errors on parameters are the same as the $M_{200}$ versions (see
text).

of cosmological parameters. Here $M_*$ is the non-linear
mass scale at scale factor $a$ in Press-Schechter theory, i.e.,
the mass for which $D(a)\,\sigma(M_*(a)) = \delta_c$, where the linear
growth factor $D(a)$ is given by Eqn. 13, $\delta_c = 1.686$ is the
critical overdensity in the spherical collapse model, and $\sigma(M)$
is the rms fluctuation of the linear density field smoothed on the
scale that on average encloses mass $M$. We choose the
parameter values $K = 2.9$ and $F = 0.001$ (different from
the original Bullock et al. numbers), which have been
demonstrated to reproduce the measured halo concentrations
in a more recent set of LCDM dark matter simulations
(Wechsler et al. 2006). With those choices, the predicted
concentrations of this halo formation model, shown as
the green curve in Figure 12, fit those inferred from the
lensing data fairly well. The $\chi^2$ between the two is 8 (for
12 degrees of freedom) for the $N_{200}$ richness binning and
12 (for 16 degrees of freedom) for the $L_{200}$ binning. In
making this comparison, we have used the fiducial
cosmological parameters given at the end of §1. Furthermore,
if we keep the Bullock $F$ parameter and the cosmological
parameters fixed, we can determine the best-fit Bullock $K$
parameter from our data: $K_{\rm fit} = 3.00 \pm 0.24$ (assuming
our fiducial cosmology with $\sigma_8 = 0.8$).

Recently, Neto et al. (2007) studied the concentrations
of halos identified from the Millennium Simulation
(Springel et al. 2005) and found a power-law
relation for the average halo concentration, $c_{200} =
5.26\,(M_{200}/10^{14}\,h^{-1}M_\odot)^{-0.1}$. The Millennium simulation
Fig. 12. The mean NFW halo concentration parameter $c_{200}$
versus halo mass $M_{200}$. Black points are from the shear profile
fits for the $L_{200}$ luminosity bins and the red points are from the
$N_{200}$ richness bins. The blue curve shows the best-fit power law
to the data (see text). The green curve shows the prediction from
the Bullock et al. (2001) model with $F = 0.001$, $K = 2.9$, and
our fiducial cosmology. The magenta curve shows the result from
Neto et al. (2007) for the Millennium Simulation (adjusted to $z =
0.25$). Note that this was fit to a cosmology with a slightly higher
normalization ($\sigma_8 = 0.9$ vs. $\sigma_8 = 0.8$) and is thus expected to
have slightly higher concentrations. The purple dashed curve is a
result from Buote et al. (2007) on X-ray clusters; the red dashed
line shows a result from a compilation of X-ray and strong-lensing
clusters (Comerford & Natarajan 2007).



uses a flat LCDM cosmology with $\Omega_m = 0.25$, $\Omega_b =
0.045$, $h = 0.73$, $n_s = 1$, $\sigma_8 = 0.9$, and $z = 0$.
Bullock et al. (2001) found that halo concentration scales
as $1/(1+z)$, which is consistent with recent observational
results from X-ray clusters, $c \propto (1+z)^{-0.71 \pm 0.52}$
(Schmidt & Allen 2007). We thus scale the Neto et al.
(2007) relation by a factor of $1/(1+z) = 0.8$ to put it at our
median cluster redshift of $z = 0.25$; this is shown as the magenta curve
in Fig. 12. This result for dissipationless halos agrees
very well with both the Bullock et al. (2001) model and
our data ($\chi^2 = 8$). Note that because the Neto et al.
(2007) results are calculated for a cosmology with slightly
higher normalization ($\sigma_8 = 0.9$ vs. $\sigma_8 = 0.8$), they are
expected to have slightly higher concentrations, and the
agreement between the two models is even better than
it looks in the figure. The large difference shown in the
Neto et al. (2007) paper between their results and the
results of Bullock et al. (2001) is due to the fact that
these authors used the original Bullock et al. (2001) values
for $K$ and $F$, instead of the updated ones that we
use here; with this change the two theoretical models
are virtually indistinguishable, and are both in excellent
agreement with our results.
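
The redshift adjustment described above is simple enough to write down explicitly. The following sketch assumes, as the text does, that the concentration scales as $1/(1+z)$; the function name is a hypothetical label for this example, and the simulation relation is used with its quoted central values only.

```python
def c200_neto(m200, z=0.0):
    """Neto et al. (2007) mean relation, rescaled by 1/(1+z).

    m200 is in h^-1 Msun. The z = 0 relation is c200 = 5.26 (M200/1e14)^-0.1;
    the 1/(1+z) scaling is the assumption adopted in the text.
    """
    return 5.26 * (m200 / 1.0e14) ** (-0.1) / (1.0 + z)


if __name__ == "__main__":
    # At the pivot mass and the median cluster redshift z = 0.25.
    print(f"c200(1e14, z=0.25) = {c200_neto(1.0e14, z=0.25):.2f}")
```

At the pivot mass of $10^{14}\,h^{-1}M_\odot$ the rescaled simulation value can be compared directly with the lensing normalization $c_{200|14} = 4.1$ of Eqn. 28.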

Buote et al. (2007) have recently presented a determination
of the concentration-mass relation as measured
by a set of 39 clusters with X-ray measurements,
finding $c_{\rm vir}(1+z) = (9.0 \pm 0.4)\,(M_{\rm vir}/M_{14})^{-0.172 \pm 0.026}$.
This is plotted as the purple dashed line on Fig.
12. Comerford & Natarajan (2007) also recently
compiled several concentration measurements from individual
strong-lensing and X-ray clusters (including those of
Buote et al. 2007 and Schmidt & Allen 2007), and found
$c_{\rm vir}(1+z) = (14.5 \pm 6.4)\,(M_{\rm vir}/M_*)^{-0.15 \pm 0.13}$, where
$M_* = 1.23 \times 10^{12}\,h^{-1}M_\odot$ at $z = 0.25$ for our fiducial
cosmology with $\sigma_8 = 0.8$. This is plotted as the red dashed
line on Fig. 12.

In the figure, each of these relations is converted to our
$M_{200}$ system for comparison. Both results have a
mass scaling that is consistent with our results, although note
that Schmidt & Allen (2007) have seen some indication
of a steepening of this power law at the highest masses from a
sample of X-ray systems.

These results all have a somewhat higher normalization
than our data; there are many possible reasons for this
discrepancy. At least some of the discrepancy is likely
due to selection effects between the samples. It is likely
that X-ray clusters and strong-lensing clusters are more
concentrated than average red-sequence clusters. In the
X-ray case, they are chosen to be relaxed systems, which
likely have higher concentrations (Wechsler et al. 2002).
This effect has been estimated to be of the order of 10-20%
(Buote et al. 2007; Schmidt & Allen 2007), but is
still somewhat uncertain. Very concentrated clusters will
also be more likely to produce strong-lensing features.
Also, the X-ray flux is proportional to the square of the
gas density, and so X-ray selection also favors more
concentrated clusters. It is also possible that our model
underestimates the amount of miscentering, which would reduce
our modeled concentrations compared to the true halo
concentrations. We may be able to better constrain the
miscentering in the future, and are also working towards
measurements with a clearly well-centered cluster sample
to further investigate these effects.

Mandelbaum et al. (2006) constrain the concentration
of typical halos containing SDSS luminous red galaxies
with galaxy-galaxy lensing. They find $c_{180b} = 5.6 \pm 0.6$,
which corresponds to $c_{200} = 3.8 \pm 0.4$, consistent with our results.

5.3. Bias scaling relations 

Figure 13 shows the scaling of the mean effective bias
parameter $B$ as a function of halo mass. The lensing
results are well fit by a power law, indicated by the blue
solid curve,



$$ B(M_{200}) = B_{200|14}\,\left(M_{200}/10^{14}\,h^{-1}M_\odot\right)^{\alpha_B} \tag{29} $$
$$ B_{200|14} = 0.26 \pm 0.02_{\rm stat} \pm 0.02_{\rm sys} $$
$$ \alpha_B = 0.38 \pm 0.02. $$

The fit is performed with all data points from both binnings,
but the errors are adjusted upward by $\sqrt{2}$ so that
they are not treated as independent data points. As
theoretically expected, the clustering strength, i.e., the bias,
increases with halo mass.

As above, it is of interest to compare these results with
the predictions of structure formation models. The halo
bias can be computed using the "peak-background split"
(Mo et al. 1996; Sheth & Tormen 1999). We consider the
model of Sheth et al. (2001), which is derived from the
elliptical collapse model and calibrated with N-body
simulations. In their bias relation, the halo mass is defined
in terms of the region within which the mean density
Fig. 13. The effective bias parameter (the coefficient of the
two-halo term) $B = b(M_{200})\,\Omega_m\,\sigma_8^2\,D^2(z)$ versus $M_{200}$: black points
show lensing results in luminosity bins, red in $N_{200}$ bins. The blue
curve is the best-fit power law (see text). The three dotted curves
show the predictions from the Sheth et al. (2001) elliptical collapse
model for three values of $\sigma_8$: (1.0, 0.8, 0.6), from top to bottom.

is 180 times the mean density of the Universe at redshift
$z$, $M_{180b} = M(r_{180b}) = (4\pi/3)\,r_{180b}^3\,180\,\bar{\rho}(z)$. Using
the formulas of the Appendix, we convert between this
definition and our expression for $M_{200}$ given in Eqn. 21.
The Sheth et al. (2001) halo bias relation is given by



$$ b(\nu) = 1 + \frac{1}{\sqrt{a}\,\delta_c}\left[\sqrt{a}\,(a\nu^2) + \sqrt{a}\,b\,(a\nu^2)^{1-c} - \frac{(a\nu^2)^c}{(a\nu^2)^c + b\,(1-c)(1-c/2)}\right], \tag{30} $$


where $\delta_c = 1.686$ and $\nu = \delta_c/(D(z)\,\sigma(M))$. Sheth et al.
(2001) chose parameters $a = 0.707$, $b = 0.5$, and $c =
0.6$ to agree with N-body simulations. However, both
Seljak & Warren (2004b) and Tinker et al. (2005)
determined that this relation over-estimates the bias at fixed
halo mass by about 20%, especially for masses less than
the non-linear mass scale $M_*$. Tinker et al. (2005) find
that this expression gives a better fit to the simulations
for $a = 0.707$, $b = 0.35$, and $c = 0.8$, and we adopt these
parameter values to compare with the lensing results.
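
To make the role of the parameters $(a, b, c)$ in Eqn. 30 concrete, here is a minimal sketch that evaluates the bias for a given peak height $\nu$. It takes $\nu$ as an input rather than computing $\sigma(M)$ and $D(z)$, which would require a linear power spectrum; the function name is a hypothetical label for this example.

```python
import math

DELTA_C = 1.686  # spherical-collapse threshold used in the text


def smt_bias(nu, a=0.707, b=0.35, c=0.8):
    """Halo bias of Eqn. 30 as a function of nu = delta_c / (D(z) sigma(M)).

    The defaults are the Tinker et al. (2005) parameter values adopted in
    the text; pass b=0.5, c=0.6 for the original Sheth et al. (2001) values.
    """
    anu2 = a * nu * nu
    sqrt_a = math.sqrt(a)
    bracket = (
        sqrt_a * anu2
        + sqrt_a * b * anu2 ** (1.0 - c)
        - anu2 ** c / (anu2 ** c + b * (1.0 - c) * (1.0 - c / 2.0))
    )
    return 1.0 + bracket / (sqrt_a * DELTA_C)


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0, 3.0):
        original = smt_bias(nu, b=0.5, c=0.6)
        updated = smt_bias(nu)
        print(f"nu = {nu:.1f}: original b = {original:.3f}, updated b = {updated:.3f}")
```

At $\nu = 1$, i.e. at the non-linear mass scale, the updated parameters give a lower bias than the original ones, in line with the over-estimate described above.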

One effect that needs to be included is that we are not
measuring $B(M_{200})$ exactly but rather $\langle B(M_{200}) \rangle$, where
the average is over the log-normal distribution of mass.
Similarly, we are plotting these versus $\langle M_{200} \rangle$. Therefore,
to compare the theoretical predictions to the data
we need to multiply the theoretical predictions at $\langle M_{200} \rangle$
by $\langle B \rangle / B(\langle M_{200} \rangle)$, which is $\exp(V_M\,\alpha_B(\alpha_B - 1)/2)$ for
a log-normal distribution. Here, $\alpha_B = 0.38$ is the
logarithmic slope, $B(M_{200}) \sim M_{200}^{\alpha_B}$, and $V_M$ is the variance of
$\ln M_{200}$. This correction varies with richness but is typically
about 10% and adds about 5-10% uncertainty to
the predictions, depending on the width of the prior
distribution for $V_M$. With our (probably overly generous)
prior of 0.6 for $\ln V_M$, this uncertainty is 10%.
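
The log-normal correction factor quoted above can be checked numerically. The sketch below compares the closed form $\exp(V_M\,\alpha_B(\alpha_B - 1)/2)$ with a direct quadrature of $\langle M^{\alpha_B} \rangle / \langle M \rangle^{\alpha_B}$ over a Gaussian in $\ln M$. The values of $V_M$ used here are illustrative assumptions, not the fitted values for any richness bin, and the function names are hypothetical.

```python
import math


def lognormal_bias_correction(var_lnm, alpha_b=0.38):
    """Closed form for <B> / B(<M>) when B ~ M^alpha_b and ln M is Gaussian."""
    return math.exp(var_lnm * alpha_b * (alpha_b - 1.0) / 2.0)


def lognormal_bias_correction_numeric(var_lnm, alpha_b=0.38, n=4001, width=8.0):
    """Same ratio by brute-force summation over a Gaussian in x = ln M.

    The mean of ln M cancels in the ratio, so it is set to zero.
    """
    sigma = math.sqrt(var_lnm)
    step = 2.0 * width * sigma / (n - 1)
    norm = mean_m = mean_b = 0.0
    for i in range(n):
        x = -width * sigma + i * step
        weight = math.exp(-0.5 * (x / sigma) ** 2)
        norm += weight
        mean_m += weight * math.exp(x)
        mean_b += weight * math.exp(alpha_b * x)
    return (mean_b / norm) / (mean_m / norm) ** alpha_b


if __name__ == "__main__":
    for var_lnm in (0.5, 1.0):  # illustrative variances of ln M200
        exact = lognormal_bias_correction(var_lnm)
        numeric = lognormal_bias_correction_numeric(var_lnm)
        print(f"V_M = {var_lnm}: closed form {exact:.4f}, quadrature {numeric:.4f}")
```

Because $\alpha_B < 1$, the factor is below unity: averaging over the mass distribution lowers the predicted bias at fixed mean mass, by roughly 10% when $V_M$ is of order unity.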

The resulting theoretical expressions for $B$ are plotted
as the dashed lines (black, magenta, and green) in
Fig. 13 for three values of $\sigma_8$: 0.6, 0.8, and 1.0. These
correspond to non-linear masses $M_*$ of 0.43, 1.23, and
$5.26 \times 10^{12}\,h^{-1}M_\odot$ at $z = 0.25$. Although the
predictions for all three choices are within $\sim 30\%$ of the
best-fit relation from lensing, the data appear to prefer lower
values of $\sigma_8$. The $\chi^2$ values are acceptable for both $\sigma_8 = 0.6$
($\chi^2_N = 7$, $\chi^2_L = 12$) and $\sigma_8 = 0.8$ ($\chi^2_N = 15$, $\chi^2_L = 20$) but
formally unacceptable for $\sigma_8 = 1$ ($\chi^2_N = 32$, $\chi^2_L = 36$).
The number of degrees of freedom is 12 and 16 for $\chi^2_N$
and $\chi^2_L$, respectively. These $\chi^2$ numbers do not include
the above-mentioned $V_M$ uncertainty and so can be
reduced by another 10-20%. We refrain from drawing
cosmological conclusions from this comparison for several
reasons. First, in fitting the halo model to the lensing
results, we assumed particular values for the cosmological
parameters (except $\sigma_8$) when we calculated the linear
correlation function $\xi_l$ (Eqn. 14) for the two-halo term.
For a self-consistent cosmological constraint, we would
need to float the cosmological parameters in calculating
the two-halo term for the lens model fit. It would also
be desirable to allow for possible scale-dependent bias,
since the predicted non-linear correlation function at the
largest scales we probe, 25-40 $h^{-1}$ Mpc, differs slightly
from the linear theory prediction (Smith et al. 2007). We
would also want to consider halo-exclusion effects (Zheng
2004). We believe that precise prediction of the bias
involving all of these effects at these intermediate scales is
not yet possible, but clearly the quality of data is improving
to the point where such a study is now warranted.

It would be better to extend the lensing measurements
to slightly larger scales ($> 50\,h^{-1}$ Mpc comoving) in
order to reduce this effect and, more importantly, to isolate
the large-scale bias measurements from degeneracies
with the NFW halo parameters. Finally, to reliably
estimate cosmological parameters we would require data
with better signal-to-noise ratios as well as more precise
shear and photometric redshift calibration. In future
wide-area, deeper lensing surveys, these conditions will
all be met, and constraints on cosmology from lensing
measurements of the halo bias will be possible.

5.4. BCG-halo mass scaling relation 

We have included this point mass term in our model
mostly to allow enough freedom so that the concentration
measurements would not be overly influenced by the
first few data points. This is especially important when
non-linear shear is considered. However, the relation
between the BCG mass and the central halo mass may be
of interest in itself. Figure 14 shows the point mass
term, $M_0$, plotted versus the mean central halo mass,
$M_{200}$. The point mass increases with central halo mass
but seems to plateau at an asymptotic mass of about
$1.3 \times 10^{12}\,h^{-1}M_\odot$. The blue curve is simply a fitting
function, $M_0 = p_0/(1 + (M_{200}/p_1)^{p_2})$, with best-fit
values $p_0 = 1.334 \times 10^{12}\,h^{-1}M_\odot$, $p_1 = 6.717 \times 10^{13}\,h^{-1}M_\odot$,
and $p_2 = -1.380$. These masses are consistent with
the expected masses of galaxy halos. Strong-lensing
constraints (e.g., Rusin et al. 2003) show that nearly
every strong lens is well fit by a singular isothermal
Fig. 14. The BCG point mass vs. mean central halo mass
for both binnings: $N_{200}$ (red) and $L_{200}$ (black). The point mass
term increases with central halo mass at low halo mass but flattens
out at about $10^{12}\,h^{-1}M_\odot$. The blue curve is a fitting function:
$M_0 = p_0/(1 + (M_{200}/p_1)^{p_2})$ with $p_0 = 1.334 \times 10^{12}\,h^{-1}M_\odot$, $p_1 =
6.717 \times 10^{13}\,h^{-1}M_\odot$, and $p_2 = -1.380$. The point mass is roughly
consistent with the masses of luminous red galaxies from
strong-lensing constraints.

sphere out to at least $100\,h^{-1}$ kpc. The 3D mass of
a singular isothermal sphere is given by $M_{\rm SIS}(r) =
4.64 \times 10^{12}\,h^{-1}M_\odot\,(\sigma_s/100\ {\rm km/s})^2\,(r/{\rm Mpc})$, which at
$25\,h^{-1}$ kpc gives $1.2 \times 10^{11}$, $4.6 \times 10^{11}$, and $1 \times 10^{12}\,h^{-1}M_\odot$
for stellar velocity dispersions $\sigma_s = 100$, 200, and 300
km/s, respectively. This mass range agrees well with our
point-mass values. This comparison is inexact since SISs
and point masses have different shear profiles. A precise
measurement of the mass density of the central BCG
would be better suited to a combination of strong and
weak lensing (e.g., Gavazzi et al. 2007) and is beyond
the scope of this paper.
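
The numbers in this comparison are easy to reproduce. The sketch below evaluates the SIS mass formula at $25\,h^{-1}$ kpc and the BCG fitting function at a few halo masses. The function names are hypothetical, the radius is assumed to be expressed in $h^{-1}$ Mpc so that the mass comes out in $h^{-1}M_\odot$, and the halo masses in the loop are illustrative rather than measured bin values.

```python
def sis_mass(sigma_kms, r_mpc):
    """3D mass of a singular isothermal sphere, in h^-1 Msun.

    sigma_kms is the velocity dispersion in km/s and r_mpc the radius in
    h^-1 Mpc (assumed unit convention for this sketch).
    """
    return 4.64e12 * (sigma_kms / 100.0) ** 2 * r_mpc


def bcg_point_mass(m200, p0=1.334e12, p1=6.717e13, p2=-1.380):
    """Fitting function for the BCG point mass M_0 versus M200 (h^-1 Msun)."""
    return p0 / (1.0 + (m200 / p1) ** p2)


if __name__ == "__main__":
    for sigma in (100.0, 200.0, 300.0):
        print(f"sigma = {sigma:.0f} km/s -> M_SIS(25 kpc/h) = {sis_mass(sigma, 0.025):.2e}")
    for m200 in (1.0e13, 6.717e13, 1.0e15):  # illustrative halo masses
        print(f"M200 = {m200:.2e} -> M_0 = {bcg_point_mass(m200):.2e}")
```

Because $p_2$ is negative, the fitting function tends to $p_0$ at high halo mass and equals $p_0/2$ at $M_{200} = p_1$, which is the plateau behavior described in the text.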

6. COMPARISON OF LENSING AND DYNAMICAL MASS 
MEASUREMENTS 



Becker et al. (2007) have recently estimated statistical
masses of maxBCG clusters from the Koester et al.
(2007b) catalog from stacked velocity dispersion
measurements. Using galaxies near each BCG with measured
spectroscopic redshifts, they build richness-dependent
histograms of velocity differences and fit the shape to
a summed, log-normally distributed, set of Gaussians.
Results show that the geometric mean velocity dispersion
scales as a power law, $\sigma_v \sim N_{200}^{0.436 \pm 0.015}$, with
a log-normal dispersion that declines from $0.40 \pm 0.02$
at $N_{200} = 10$ to $0.15 \pm 0.09$ at $N_{200} = 88$. Although
the typical maxBCG cluster contains few galaxies
with spectroscopic redshifts, it is the case that, as
with cross-correlation lensing, the velocity histograms
can be stacked from many clusters within a richness
bin to build a high signal-to-noise histogram of
the average velocity differences. The best-fit velocity
dispersion implies a mass, $M_{200}$, derived from the
dark matter virial relation, $\sigma_{\rm DM}(M_{200}, z) = (1082.9 \pm
4.0\ {\rm km/s})\,(h(z)\,M_{200}/10^{15}M_\odot)^{0.3361 \pm 0.0026}$, calibrated
recently from a suite of N-body simulations (Evrard et al.
2007). Galaxy and dark matter dynamics may differ,
and this possibility is approximately treated as a constant
velocity bias parameter, $b_v = \sigma_v/\sigma_{\rm DM}$. Figure 15
shows the mean virial mass estimates from the dynamical
measurements in red and the lensing halo masses (from
Fig. 11) in black. The line shows the best-fit power-law
relation from the lensing masses. The Becker et al.
(2007) error bars include systematic errors inherent in
their method, and are correlated. The statistical lensing
and dynamical mass estimates appear in very good agreement,
but systematic uncertainties in $b_v$ remain to be
understood. When Becker et al. (2007) tested their virial
mass estimator on mock SDSS maxBCG catalogs, they
found it to systematically underestimate halo masses by
25%. This correction factor has not been applied to the
estimates in Fig. 15. Including it would elevate $b_v^3 M_{\rm vir}$
above the lensing masses, suggesting a positive velocity
bias, $b_v \sim 1.1$. The current level of agreement indicates
that the velocity bias parameter is not significantly
different from unity. We defer a more formal analysis of
these issues to future work.

Fig. 15. Mean halo mass vs. richness from the lensing profiles
(black points, the same as those shown in Fig. 11) and from
dynamical galaxy velocity dispersion measurements (red points) from
the same cluster sample (Becker et al. 2007). The two scaling relations
are in good agreement, although the lensing data provide a
tighter relation. The black curve shows the best-fit power-law relation
from the lensing data.
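
The link between a mass offset and the velocity bias follows directly from the virial relation quoted above. The sketch below inverts that relation for the mass and converts a mass ratio into the implied $b_v$. It assumes that $h(z)$ denotes the Hubble parameter in units of 100 km/s/Mpc, uses only the central values of the calibration, and the function names are hypothetical.

```python
def virial_mass_from_dispersion(sigma_kms, hz, sigma15=1082.9, alpha=0.3361):
    """Invert sigma_DM = sigma15 * (h(z) M200 / 1e15 Msun)^alpha for M200 (Msun).

    hz is h(z) = H(z) / (100 km/s/Mpc), an assumed convention for this sketch.
    """
    return 1.0e15 / hz * (sigma_kms / sigma15) ** (1.0 / alpha)


def velocity_bias_from_mass_ratio(mass_ratio, alpha=0.3361):
    """Velocity bias implied when (dynamical mass) / (true mass) = mass_ratio.

    Since M scales as sigma^(1/alpha), a dispersion biased by b_v shifts the
    inferred mass by b_v^(1/alpha), roughly b_v cubed.
    """
    return mass_ratio ** alpha


if __name__ == "__main__":
    # Correcting a 25% underestimate multiplies the dynamical masses by 1/0.75.
    b_v = velocity_bias_from_mass_ratio(1.0 / 0.75)
    print(f"implied velocity bias b_v = {b_v:.3f}")
```

With the 25% correction applied, the implied bias comes out close to the value $b_v \sim 1.1$ quoted in the text.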

7. DISCUSSION 

In Paper I of this series, we demonstrated that
cross-correlation weak lensing can be measured around clusters
of galaxies out to large radii, $R \sim 30\,h^{-1}$ Mpc. In
this work, we have shown that these mean shear profiles
are well described by realistic models derived from N-body
simulations. Our primary results are the lensing
calibration of the cluster halo mass-richness and
mass-luminosity relations and measurements of the scaling
relations between the mass, bias, and concentration of halos.
We also show that lensing-inferred masses are consistent
with estimates from stacked velocity dispersion
measurements (Becker et al. 2007) as long as the velocity bias
parameter is not significantly different from unity. The
scaling relation between halo concentration and mass
that we derive from lensing agrees well with the results
of N-body simulations (e.g., the model of Bullock et al.
2001 as updated by Wechsler et al. 2006, or the recent
results of Neto et al. 2007). The scaling between halo
bias and mass from lensing is in agreement with the
simulation-calibrated predictions of Sheth et al. (2001).

In this work, we have limited the analysis to the mod- 
eling of the lensing profiles. However, for completeness 
we now describe some cosmological applications of these 
results that are now possible. We then conclude by sug- 
gesting some applications of these methods that will be 
possible with the ambitious wide-field surveys now being 
planned. 

Perhaps the most obvious application of the measured
halo mass-richness relation is to measurement of
the mass function of clusters. Previously, Rozo et al.
(2007b) completed a first analysis constraining cosmology
through the cluster mass function using the same
SDSS cluster catalog. They modeled the mass-richness
relation using the Halo Occupation Distribution (HOD)
model, without a strong observational prior on the
mass-richness relation itself. In this model, one adopts
a parametrized mass-richness relation, and Rozo et al.
(2007b) employed tight cosmic microwave background
and SN Ia priors on cosmological parameters except for
$\sigma_8$, for which only a non-informative prior was used.
They found $\sigma_8 = 0.92 \pm 0.1$ and derived constraints on
the HOD model parameters. This method (Rozo et al.
2007a) employs marginalization over a generous supply
of nuisance parameters that connect the observables to
mock catalog predictions (Wechsler et al., in preparation).
While Rozo et al. (2007b) represents one of the
more robust measurements of $\sigma_8$ from the cluster mass
function, an update to this work using the mass-richness
relation derived from lensing is in progress. This should
allow for tighter constraints on $\sigma_8$, a tight constraint on
$\Omega_m$, as well as a more precise measurement of the HOD
parameters.

Mandelbaum & Seljak (2007) have put a lower bound
of $\sigma_8(\Omega_m/0.25)^{0.5} > 0.62$ at 95% C.L. by employing
a method simpler than full modeling of cluster number
counts. They argue that the lensing signal around a sample
of isolated luminous red galaxies in the SDSS could
not be produced by low values of $\sigma_8$, since too few clusters
would have formed. Interpretational complications such
as incompleteness of the cluster sample or miscentering
would only decrease the predicted signal, so their bound
should be robust.

There are several ways in which measurements of
stacked lensing profiles around clusters can be used to
derive entirely new constraints on cosmology. The
amplitude of the linear galaxy or cluster auto-correlation
function measures the combination $b^2\sigma_8^2 D^2(z)$, whereas
lensing measures $b\,\sigma_8^2 D^2(z)\,\Omega_m$. Combining both galaxy
or cluster auto-correlations with lensing will thus allow
one to measure the two combinations $\Omega_m\sigma_8 D(z)$ and
$b/\Omega_m$. By combining these two measurements into an
estimate of $\Omega_m\sigma_8 D(z)$, it is possible to directly probe
the growth of structure. The linear growth factor is
sensitive to cosmological parameters affecting the Hubble
parameter, such as $\Omega_m$, as well as to dark energy and
spatial curvature. This growth measurement would
complement geometric probes of dark energy such as Type Ia
supernovae and baryon acoustic oscillations. In addition,
this measurement of the growth factor would complement
cluster number counts since it extracts information from
much larger scales. Measuring dark energy through this
direct measurement of the growth factor would have very
different systematics from both cosmic shear and cluster
number counts. This measurement will most likely
require lensing measurements extending to slightly larger
scales, 50-100 $h^{-1}$ Mpc, to better tie down the $B$
parameter, as well as more attention to systematic errors such
as shear calibration and photo-z biases. Since it relies
on large-scale information, it will require a deep survey
over a large fraction of the sky to reduce the cosmic
variance to a small enough level to compete with the other
methods.
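
The algebra behind combining the two amplitudes can be spelled out in a few lines. In the sketch below, `auto_amp` stands for $b^2\sigma_8^2 D^2(z)$ and `lens_amp` for $b\,\sigma_8^2 D^2(z)\,\Omega_m$; the function name and the input numbers are hypothetical and serve only to verify the round trip.

```python
import math


def combine_amplitudes(auto_amp, lens_amp):
    """Separate the growth combination from the bias.

    auto_amp = b^2 sigma8^2 D^2   (auto-correlation amplitude)
    lens_amp = b sigma8^2 D^2 Om  (lensing two-halo amplitude)

    Returns (Om * sigma8 * D, b / Om).
    """
    growth_combo = lens_amp / math.sqrt(auto_amp)
    bias_over_om = auto_amp / lens_amp
    return growth_combo, bias_over_om


if __name__ == "__main__":
    # Hypothetical inputs used only to check the algebra.
    b, sigma8, growth, omega_m = 2.0, 0.8, 0.9, 0.27
    auto_amp = (b * sigma8 * growth) ** 2
    lens_amp = b * (sigma8 * growth) ** 2 * omega_m
    combo, ratio = combine_amplitudes(auto_amp, lens_amp)
    print(f"Om*sigma8*D = {combo:.4f}, b/Om = {ratio:.4f}")
```

The bias cancels in the first combination, which is why it isolates the growth of structure.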

Seljak et al. (2005) employed a similar technique to
constrain $\sigma_8$ by using lensing to constrain halo masses so
that halo biases could be predicted and used to "de-bias"
the galaxy power spectrum. This, however, requires the
complication of HOD modeling to connect galaxies to the
halos that they occupy. It could be simpler to apply this
idea directly to clusters, since this requires only
large-scale auto-correlation function (or power spectrum)
measurements and does not require large-scale lensing
measurements. This approach does, however, rely on models
for the bias prediction (e.g., Sheth & Tormen 1999;
Seljak & Warren 2004b), so direct measurement of the
bias would be preferred as long as the errors are
sufficiently small.

Future weak lensing surveys such as SNAP
(SNAP Collaboration 2005), DUNE (Refregier et al.
2006), LSST (Tyson et al.), and DES
(Dark Energy Survey Collaboration 2005) would
be ideal for these types of measurements. The statistical
errors on the average shear in a radial bin should be
at the percent level for these surveys, compared to
50% for the SDSS cluster data for identical binning.
Since the dark energy constraints from measurement
of the shear power spectrum will already require shear
calibration and photo-z biases below the percent level,
this would suggest that these surveys should be able
to measure halo masses, concentrations, and biases at
about the percent level for perhaps hundreds of richness
bins. Entirely new ways of using lensing to constrain
cosmology may be possible. For example, baryon
acoustic oscillations should leave their imprint on the
$\Delta\Sigma$ profile at comoving scales of $100\,h^{-1}$ Mpc and will
be detectable with surveys such as these. Determining
how to extract the most information from such a data
set should remain a fruitful area of study.
"Virial-type" mass definitions all have the form

$$ M_a = M(r_a) = \frac{4\pi}{3}\,r_a^3\,\Delta_a\,\rho_a, \tag{1} $$

where $M(r)$ is the mass profile. The number $\Delta_a$ may be a function of cosmology and redshift. Typical choices for
$\Delta_a$ are 200 and 180. The density $\rho_a$ is always some variation of the critical density, $\rho_{\rm crit}$, but it may be $\rho_{\rm crit}(z)$
or $\bar{\rho}(z) = \rho_{\rm crit}(z = 0)\,\Omega_m\,(1+z)^3$. Let us define $D_a = \Delta_a\rho_a$, since the conversion between different conventions only
depends on this product.

For any two choices of $D_a$, there is a conversion between them for the mass $M_a$ (or equivalently $r_a$) and the NFW
concentration parameter $c_a$. $M_a$ and $c_a$ (unlike $r_s$ and $r_a$) are independent of the choice between physical and comoving
units.

Hu & Kravtsov (2003) discuss this issue but we will review the conversion again here.
The NFW form for the density profile is given by

$$ \rho(r) = \frac{\rho_s}{(r/r_s)\,(1 + r/r_s)^2}. \tag{2} $$

Under this assumption, the mass profile for some choice of mass definition is given by

$$ M(r_a) = 4\pi\rho_s r_a^3\,f(r_s/r_a), \tag{3} $$

where

$$ f(x) = x^3\left[\ln(1 + x^{-1}) - (1 + x)^{-1}\right], \tag{4} $$

and the concentration is defined as $c_a = r_a/r_s$. The parameters $r_s$ and $\rho_s$ are independent of the choice of $D_a$, so
for any other choice $D_b$ we have $3\rho_s = D_a/f(1/c_a) = D_b/f(1/c_b)$. Therefore we have the conversion between the two
concentrations,
$$ 1/c_b = f^{-1}\!\left(\frac{D_b}{D_a}\,f(1/c_a)\right). \tag{5} $$

Similarly, $r_s = r_a/c_a = r_b/c_b$, so the conversion for these "virial" radii is $r_b = r_a c_b/c_a$, and the conversion between
masses is

$$ M_b = M_a\,\frac{D_b}{D_a}\left(\frac{c_b}{c_a}\right)^3. \tag{6} $$


The inverse function of / needs to be computed with a look-up table and interpolation since a simple closed-form 
expression does not exist. However, the conversion simply depends on the ratio Db/D a . 

As an example, we consider the two most common choices, $D_{200c} = 200\rho_{\rm crit}(z) = 200\rho_{\rm crit}(0)\,H^2(z)/H_0^2 =
200\rho_{\rm crit}(0)\,[\Omega_m(1+z)^3 + (1 - \Omega_m)]$ (in a flat LCDM universe) and $D_{180b} = 180\bar{\rho}(z) = 180\rho_{\rm crit}(0)\,(1+z)^3\,\Omega_m$.
The ratio of these is

$$ \frac{D_{180b}}{D_{200c}} = \frac{9}{10}\,\Omega_m(z) = \frac{9}{10}\,\frac{\Omega_m(1+z)^3}{\Omega_m(1+z)^3 + (1 - \Omega_m)}. \tag{7} $$


We use this formula to convert our measured masses $M_{200c}$ to $M_{180b}$, using $z = 0.25$ and $\Omega_m = 0.27$, which gives
$D_{180b}/D_{200c} = 0.377$. We use this conversion to compute the halo bias, since it has been shown to be nearly universal
when expressed in the $D_{180b}$ mass definition.
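
The conversion described above can be sketched in a few lines. The example below replaces the look-up table with a simple bisection on $f(1/c_b) = (D_b/D_a)\,f(1/c_a)$, which is a substitution made for this sketch and not necessarily the procedure used for the tables; the function names and the bracketing interval for $c_b$ are likewise hypothetical. It uses the standard NFW enclosed-mass function $f(x) = x^3[\ln(1 + 1/x) - 1/(1 + x)]$ with $x = r_s/r$.

```python
import math


def f_nfw(x):
    """NFW mass function f(x) = x^3 [ln(1 + 1/x) - 1/(1 + x)], with x = r_s / r."""
    return x ** 3 * (math.log(1.0 + 1.0 / x) - 1.0 / (1.0 + x))


def convert_concentration(c_a, d_ratio, lo=1.0e-3, hi=1.0e3, tol=1.0e-10):
    """Solve f(1/c_b) = d_ratio * f(1/c_a) for c_b by bisection in log c.

    d_ratio is D_b / D_a. The bracket [lo, hi] is an assumption of this
    sketch; solutions outside it are not handled.
    """
    target = d_ratio * f_nfw(1.0 / c_a)
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if f_nfw(1.0 / mid) > target:  # f(1/c) decreases with c, so c is too small
            lo = mid
        else:
            hi = mid
        if hi / lo - 1.0 < tol:
            break
    return math.sqrt(lo * hi)


def convert_mass(m_a, c_a, d_ratio):
    """Return (M_b, c_b) using M_b = M_a (D_b/D_a) (c_b/c_a)^3."""
    c_b = convert_concentration(c_a, d_ratio)
    return m_a * d_ratio * (c_b / c_a) ** 3, c_b


def ratio_180b_to_200c(z, omega_m):
    """D_180b / D_200c = (9/10) Omega_m(z) in a flat LCDM universe."""
    matter = omega_m * (1.0 + z) ** 3
    return 0.9 * matter / (matter + (1.0 - omega_m))


if __name__ == "__main__":
    d_ratio = ratio_180b_to_200c(0.25, 0.27)
    m_180b, c_180b = convert_mass(9.19, 5.21, d_ratio)  # mass in 1e12 h^-1 Msun
    print(f"D_180b/D_200c = {d_ratio:.3f}, M_180b = {m_180b:.2f}, c_180b = {c_180b:.2f}")
```

The example input, $M_{200} = 9.19$ and $c_{200} = 5.21$, is one row of the luminosity-bin table of converted masses; the output can be compared with the tabulated 12.09 and 7.89 for the same row.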

Similarly, to calculate the halo concentration using the Bullock et al. (2001) model, we need to convert $M_{200}$ to
$M_{\rm vir}$. This conversion uses (Bryan & Norman 1998)

$$ \Delta_{\rm vir} = \frac{18\pi^2 + 82x - 39x^2}{1 + x}, \tag{8} $$

with $x = \Omega_m(z) - 1$. This results in

$$ \frac{D_{\rm vir}}{D_{200c}} = \frac{18\pi^2 + 82x - 39x^2}{200}. \tag{9} $$
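
For reference, the virial ratio can be evaluated in the same way as the $D_{180b}/D_{200c}$ example. This is a minimal sketch assuming a flat LCDM universe; the function names are hypothetical.

```python
import math


def omega_m_at_z(z, omega_m):
    """Matter density parameter at redshift z in a flat LCDM universe."""
    matter = omega_m * (1.0 + z) ** 3
    return matter / (matter + (1.0 - omega_m))


def ratio_vir_to_200c(z, omega_m):
    """D_vir / D_200c = (18 pi^2 + 82 x - 39 x^2) / 200 with x = Omega_m(z) - 1."""
    x = omega_m_at_z(z, omega_m) - 1.0
    return (18.0 * math.pi ** 2 + 82.0 * x - 39.0 * x * x) / 200.0


if __name__ == "__main__":
    print(f"D_vir/D_200c at z = 0.25, Omega_m = 0.27: {ratio_vir_to_200c(0.25, 0.27):.3f}")
```

For $z = 0.25$ and $\Omega_m = 0.27$ the ratio lies between the $D_{180b}$ value of 0.377 and unity, which is why the $M_{\rm vir}$ entries in the conversion tables fall between the $M_{200}$ and $M_{180b}$ entries.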